Hatena::�֥���(Diary)

ablog ���Υڡ����򥢥��ƥʤ��ɲ� RSS�ե����� Twitter

2017-04-24

pyenv �� Python �ΥС������������ؤ��ƻȤ�

pyenv ���Ȥä�ʣ���ΥС��������� Python �����󥹥ȡ����������ؤ�����ˡ�����⡣


pyenv �Ȥ�

ʣ���С��������� Python �����󥹥ȡ������ƥ��ޥ��ɰ�ȯ�ǻȤ��С������������ؤ��뤳�Ȥ��Ǥ����ġ��롣

pyenv lets you easily switch between multiple versions of Python. It's simple, unobtrusive, and follows the UNIX tradition of single-purpose tools that do one thing well.

This project was forked from rbenv and ruby-build, and modified for Python.

f:id:yohei-a:20170424184508p:image

pyenv does...

  • Let you change the global Python version on a per-user basis.
  • Provide support for per-project Python versions.
  • Allow you to override the Python version with an environment variable.
  • Search commands from multiple versions of Python at a time. This may be helpful to test across Python versions with tox.
GitHub - pyenv/pyenv: Simple Python version management

pyenv �����󥹥ȡ�������

$ brew install pyenv
  • ~/.zshrc �˰ʲ����ɵ�
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
$ source ~/.zshrc

Python �����󥹥ȡ�������

$ pyenv install --list
$ pyenv install 3.5.0
$ pyenv install 2.7.13

�С������������ؤ���

pyenv global <�С�������> �Ȥ��������Τˡ�pyenv local ��<�С�������> �����Ȥ��Υ��������ǥ��쥯�ȥ���ȿ�Ǥ����롣���١����ꤹ���ȱ�³�������롣

$ pyenv versions 
  system
* 2.7.13 (set by /Users/******/.pyenv/version)
  3.5.0
  • 3.5.0 �����ؤ���
$ pyenv global 3.5.0
$ pyenv version
3.5.0 (set by /Users/*******/.pyenv/version)
  • 2.7.13 �����ؤ���
$ pyenv global 2.7.13
$ pyenv version
2.7.13 (set by /Users/******/.pyenv/version)

�Ķ�


����

NoDB: Efficient Query Execution on Raw Data Files

1990ǯ�夫�黺�߽Ф������ǡ����̤������夬���������Ƥ�������ʬ�ϲ�ǽ�ʥǡ����̤��礭���Ѥ��äƤ��ʤ���

���ظ��ǡ����١����ʤɤ˥����ɤ��������̥ǡ�������®��ʬ�ϤǤ��뤬�������ɻ��֤��������ä���2010ǯ�������ä�DB�������ǡ�����CSV�ե������ʤɡˤ������ɤʤ��ǥ����ɺѤߤ˶ᤤ��ǽ���Ф����椬�ʤ�����

Oracle Database��PostgreSQL��MySQL �ʤɤǤ�DB�������ե�������ɽ�Ȥ��ư�����ǽ�Ϥ��Ǥ˼��������Ƥ��롣

�����γ���ɽ�ǤϤʤ�I/O�̺︺�Τ��ᡢSelective tokenizing/parsing/tuple formation��Adaptive indexing (positional map)��Tuple caching �Ȥ��ä����椬�����Ƥ��롣

Redshift Spectrum �� Redshift �ǹ��٤�ʬ�Ϥ��Ĥĥ����ɤ������ʤ�/�������ʤ�S3���Υǡ����� Redshift ���Υǡ����Ȱ����˰��Ĥ�����������ʬ�ϤǤ����褦�ˤʤä���Redshift �����Ϥ����������褫���Ĥġ�ʬ���оݥǡ������ϰϤ�S3�ޤǹ����ä���


PostgreSQL �����Ϥ������������Ѥ��� Presto �����������ꤲ�� Tresure Data �� Prestogres �ȥ��ץ����������Ƥ����Ȼפ���

Athena ���ܳ�Ū�� DWH �Ȥ��������ϰʲ��Τ褦��ʬ�����٤��㤤�ǡ����򥢥ɥۥå���ʬ�Ϥ����桼����������������

������������JOIN �����Ȥ���ʣ���������������̥ǡ�����Ĺ���ֽ��������ˤ� Redshift �Τۤ��������Ƥ��롣

�Ǹ��� yokatsuki �ˤ��� Redshift Spectrum ��������¤�ιͻ������Ѥ��ޤ���

�ǽ��˷���

  • Redshift Spectrum��S3����������ǽ�ϡ�΢��Athena�����Ѥ��Ƥ���
  • Redshift Spectrum�ϡ�Athena��Redshift��Ϣ�Ȥ�����ʬ�μ����Ǥ���
����ǽ��Redshift Spectrum�٤�������¤���ͻ����� �� Developers.IO

����

2017-04-21

Amazon Redshift ����³���ξ���

���饹�������Ф���Ʊ���˲�ǽ�ʥ桼������³�ο��� 500 �Ǥ����ܺ٤ˤĤ��Ƥϡ���Amazon Redshift Cluster Management Guide�פΡ����饹�����ؤ���³�פ򻲾Ȥ��Ƥ���������

Amazon Redshift �ˤ��������� - Amazon Redshift

2017-04-20

���ظ��ǡ����١����Υڡ����Υǡ�����¤

�Իظ��ǡ����١����Ϲ�ñ�̤ǥڡ���(Oracle Database �Ǥ����ǡ����֥��å�)�˥ǡ�������Ǽ���Ƥ����Τ��Ф��ơ����ظ��ǡ����١������󤴤Ȥ˥ڡ����˳�Ǽ���Ƥ��롣�������¹Ի��˷��̥��åȤ��֤��ݤ����̤˥Х��Х��Υڡ����˳�Ǽ�����Ƥ����ǡ������ɤ����äƥ��ץ�(�쥳����)���������Ƥ���*1�Τ��Ȼפä������Ϥ���ID�Τ褦�ʤ��Τ����äƤ����褦����

��ID �� C-Store �Ǥ� pid��MonetDB �Ǥ� BAT(Binary Association Tables) �� oid �ȸƤФ��Ƥ��롣

The Design and Implementation of Modern Column-Oriented Database Systems

f:id:yohei-a:20170420230742p:image:w640

  • NSM(N-ary Storage Model): �������ǥ֥��å��ʥڡ����ˤ˥ǡ�������Ǽ��������
  • DSM(Decomposition Storage Model): �������ǥ֥��å��ʥڡ����ˤ˥ǡ�������Ǽ��������

f:id:yohei-a:20170420233150p:image:w360

C-Store �� pid �� MonetDB �� oid �Τ褦�� ID �ʳ��� Join Index �Ȥ�����ˡ�⤢�롣

����Ū�ʼ����� C-Store �Υ�������������Ĵ�٤뤳�Ȥ��Ǥ��롣


�����˾ܤ����Ͻ��������ΰʲ��Υ����ȥ��ǾҲ𤵤��Ƥ���

VLDB 2009 Tutorial on Column-Stores *1 ��ɽ�����줿�Ȥ����ФäƤ��ޤ����錄���ζ��ʽ��Ǥ������ʽ��פȸ��äƤ��ޤä� appengine ja night #21 *2��

Google BigQuery�ʤɤλ��Ȥߤ��Τꤿ���Ȥ������ظ��ǡ����١����������� - wmo6hash::blog

��������饤�ɻ��ȡ�


https://www.cs.duke.edu/courses/fall01/cps216/lectures/10-physical.pdf���⥷���ץ����ɤ�������

NSM = N-ary Storage Model

DSM = Decomposition Storage Model


��­

C-Store �Ϥ��Υޥ����롦���ȡ����֥졼�������������������ظ��ǡ����١����ǡ�����Ū�ˤ��������Ƥ��� Vertica(HP��) �Υ롼�ĤǤ���

���ȡ����֥졼���������Ȱ����ˤ�����ʸ���񤤤Ƥ��� Daniel Abadi �����Ͼ��� VLDB 2009 Tutorial on Column-Stores ���񤤤Ƥ��뤦���ΰ��ͤǤ⤢�ꡢʬ�����䤹���������񤫤��Ƥ����Τ��ץ����å��Ǥ���


MonetDB �ˤĤ��Ƥϰʲ�����


���ȡ����Х����������ǥå���(�����ǥå��������꡼��������)�Ȱ㤦�Τϰʲ��������Ȼפ��ޤ���


����

Amazon Aurora �� MySQL Client on macOS ������³����

���󥹥ȡ���

$ brew install mysql --client-only
  • �С�����������ǧ����
$ mysql --version
mysql  Ver 14.14 Distrib 5.7.18, for osx10.12 (x86_64) using  EditLine wrapper

Amazon Aurora �����󥹥���������ư����


��³���Ƥߤ�

  • Amazon Aurora ����³����
$ mysql -h mydbinstance-cluster.cluster-******.us-east-1.rds.amazonaws.com -u master -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 55
Server version: 5.6.10 MySQL Community Server (GPL)

Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
+--------------------+
3 rows in set (0.20 sec)
mysql> create database test;
Query OK, 1 row affected (0.17 sec)


mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.16 sec)
mysql> use test;
Database changed
mysql> create table test(col1 int, col2 varchar(255));
Query OK, 0 rows affected (0.18 sec)
mysql> insert into test(col1, col2) values(1, '  Work hard, Have fun, Make History!!  ');
Query OK, 1 row affected (0.17 sec)

mysql> insert into test(col1, col2) values(2, 'Work hard, Have fun, Make History!!');
Query OK, 1 row affected (0.17 sec)
  • insert �����Ԥ���ǧ����
mysql> select * from test;
+------+-----------------------------------------+
| col1 | col2                                    |
+------+-----------------------------------------+
|    1 |   Work hard, Have fun, Make History!!   |
|    2 | Work hard, Have fun, Make History!!     |
+------+-----------------------------------------+
2 rows in set (0.17 sec)
  • �⤦1�� insert ���Ƴ�ǧ���Ƥߤ�
mysql> insert into test(col1, col2) values(1, '  Work hard, Have fun, Make History!!                 ');
Query OK, 1 row affected (0.16 sec)

mysql> select * from test;
+------+--------------------------------------------------------+
| col1 | col2                                                   |
+------+--------------------------------------------------------+
|    1 |   Work hard, Have fun, Make History!!                  |
|    2 | Work hard, Have fun, Make History!!                    |
|    1 |   Work hard, Have fun, Make History!!                  |
+------+--------------------------------------------------------+
3 rows in set (0.18 sec)

����

Github �Υ��ݥ��ȥ�������������ˡ

f:id:yohei-a:20170420175506p:image:w640

  • ���ݥ��ȥ�̾�����Ϥ���"I understand the consequences, delete this repository" �򥯥��å�

f:id:yohei-a:20170420175456p:image:w640


����

git-secrets �� Github �� AWS �� Credential ���������Ƥ��ޤ��ʤ��褦�ˤ���

git-secrets �Ȥ����ġ����ǡ����ä� AWS �� Credential �� GitHub �˸������Ƥ��ޤ��ʤ��褦�ˤ��롣


$ brew install git-secrets
  • ��¸�ץ��������Ȥ˥եå�����
$ cd ~/Documents/github/yoheia
$ git secrets --install
&#10003; Installed commit-msg hook to .git/hooks/commit-msg
&#10003; Installed pre-commit hook to .git/hooks/pre-commit
&#10003; Installed prepare-commit-msg hook to .git/hooks/prepare-commit-msg
  • �����������Ԥ�
$ git secrets --register-aws --global
  • ������������ǧ����
$ git secrets --list
secrets.providers git secrets --aws-provider
secrets.patterns [A-Z0-9]{20}
secrets.patterns ("|')?(AWS|aws|Aws)?_?(SECRET|secret|Secret)?_?(ACCESS|access|Access)?_?(KEY|key|Key)("|')?\s*(:|=>|=)\s*("|')?[A-Za-z0-9/\+=]{40}("|')?
secrets.patterns ("|')?(AWS|aws|Aws)?_?(ACCOUNT|account|Account)_?(ID|id|Id)?("|')?\s*(:|=>|=)\s*("|')?[0-9]{4}\-?[0-9]{4}\-?[0-9]{4}("|')?
secrets.allowed AKIAIOSFODNN7EXAMPLE
secrets.allowed wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

�Ķ�


����

Amazon Redshift �Υץ饤�ޥꥭ��


�祭�����󡢰������󡢳������������ϰ�ȿ���Ƥ⥨�顼�ˤʤ��ʤ��Τ����ա�

CREATE TABLEʸ�����������������̤ˤǤ����ΤǤ�������ȿ���Ƥ����ǡ�����INSERT���褦�Ȥ��Ƥ⥨�顼�ˤϤʤ餺INSERT�Ǥ��Ƥ��ޤ��ޤ����ʤ�NOT NULL�����˰�ȿ���������Ϥ������ȥ��顼�ˤʤ��ޤ���

�ץ饤�ޥꥭ�������ӳ������������������� - Amazon Redshift

���줫��Amazon Redshift���Ϥ��뵻�ѼԤ����դ��٤������ĤΥݥ����� �� Developers.IO

���������ץ饤�ޥꥭ���������ӳ��������������Ͼ����󶡤Τߤ���Ū�Ȥ��Ƥ��ꡢAmazon Redshift �ˤ��äƶ��פ����뤳�ȤϤ����ޤ��󡣤��������ץ饤�ޥꥭ���ȳ��������ϥץ����˥󥰻��Υҥ��ȤȤ��ƻ��Ѥ����ޤ������ץꥱ���������� ETL �ץ������ޤ���¾�β��餫�Υץ������ˤ��äƤ������Υ����������������פ����������ϡ��������Υ�������������ɬ�פ������ޤ���

���Ȥ��С��������ץ����ʡ��������������׻��ǥץ饤�ޥꥭ���ȳ������������Ѥ��ơ������������������ص��Ѥ˱ƶ���Ϳ���������������ӻ��ȴط������������ꡢ¿���η������ؼ������ꡢ��Ĺ�ʷ��������򤷤��ꤷ�ޤ���

�ץ����ʤϤ������Υ����δط������Ѥ��ޤ�����Amazon Redshift �ơ��֥��Τ��٤ƤΥ����������ɻ���ͭ���Ǥ��뤳�Ȥ������Ȥʤ��ޤ������ץꥱ����������̵���ʳ��������ޤ��ϥץ饤�ޥꥭ�������Ĥ������硢�����Ĥ����������������ʷ��̤��֤���ǽ���������ޤ������Ȥ��С��ץ饤�ޥꥭ�������դǤʤ����硢SELECT DISTINCT ����������ʣ�����Ԥ��֤����Ȥ������ޤ���ͭ�����ɤ����狼���ʤ������ϡ��ơ��֥����Ф��ƥ������������������ʤ��Ǥ���������������ͭ�����Ȥ狼�äƤ��������ϡ��ץ饤�ޥꥭ�������������������Ӱ�������������ɬ���������Ƥ���������

Amazon Redshift �ϡ�NOT NULL ����������Ŭ�Ѥ��ޤ���

���������� - Amazon Redshift

����

*1���̡��Υڡ����˳�Ǽ�����Ƥ��륫�����Υǡ��������礷���쥳�����ʥ��ץ��ˤ˲��򥭡������������Τ���

2017-04-18

AWS���󥽡����Υǥե����ȥ꡼�����������ꤹ��

URL�ѥ��᡼�����꡼�����������ꤹ�����ɤ����ʲ������������꡼�����������ꤷ�Ƥ��롣

https://console.aws.amazon.com/console/home?ap-northeast-1#

�꡼�������� �꡼�������ȥ��٥����ӥ��ƥ��������� - Amazon Elastic Compute Cloud �dz�ǧ�Ǥ��롣


����

https://console.aws.amazon.com/console/home?region=ap-northeast-1

�Τ褦���꡼���������ꤷ���֥å��ޡ������ȤäƤ��ޤ������꡼���������������ƥ����󥢥��Ȥ��Ƥ����Ǥ⡢�����֥å��ޡ����������ȥ����󥤥����˰տ��̤������꡼�������Υ��󥽡��뤬ɽ�������Ƥ��ޤ���

AWS(Amazon Web Services) - AWS�ǥե����ȤΥ꡼�������������ˤ�����(29291)��teratail

Mac ���� Amazon Redshift �˥��ޥ��ɥ饤���ġ�����psql�ˤ���³����

PostgreSQL�����󥹥ȡ���

$ brew install postgresql
$ psql --version
postgres (PostgreSQL) 9.6.2

�ȤäƤߤ�

  • ��³����
% psql "host= matrix.***.ap-northeast-1.redshift.amazonaws.com user=neo dbname=zion port=5439"
Password: 
psql (9.6.2, server 8.0.2)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: on)
Type "help" for help.
  • DB������ɽ������
zion=# \l
                  List of databases
     name     | owner | encoding | access privileges 
--------------+-------+----------+-------------------
 dev          | rdsdb | UNICODE  | 
 padb_harvest | rdsdb | UNICODE  | 
 template0    | rdsdb | UNICODE  | rdsdb=CT/rdsdb
 template1    | rdsdb | UNICODE  | rdsdb=CT/rdsdb
 zion         | neo   | UNICODE  | 
(5 rows)
zion=# \d
         List of relations
 schema |   name    | type  | owner 
--------+-----------+-------+-------
 public | customer  | table | neo
 public | dwdate    | table | neo
 public | lineorder | table | neo
 public | part      | table | neo
 public | supplier  | table | neo
(5 rows)
  • ��λ����
zion-# \q

����

2017-04-16

Amazon Redshift ���ȤäƤߤ�

Redshift ���饹��������������


���饤��������JackDB�ˤ�����³����


JackDB����Redshift�ؤΥ������������Ĥ���


SQL���¹Ԥ��Ƥߤ�

select * from pg_class limit 100;

f:id:yohei-a:20170416191306p:image


���塼�ȥꥢ��: �ơ��֥��߷פΥ��塼�˥��� - Amazon Redshift ����Ƥߤ�

�����ȥ�����ʬ�����������̤����Ѥ��ʤ�������
CREATE TABLE part 
(
  p_partkey     INTEGER NOT NULL,
  p_name        VARCHAR(22) NOT NULL,
  p_mfgr        VARCHAR(6) NOT NULL,
  p_category    VARCHAR(7) NOT NULL,
  p_brand1      VARCHAR(9) NOT NULL,
  p_color       VARCHAR(11) NOT NULL,
  p_type        VARCHAR(25) NOT NULL,
  p_size        INTEGER NOT NULL,
  p_container   VARCHAR(10) NOT NULL
);

CREATE TABLE supplier 
(
  s_suppkey   INTEGER NOT NULL,
  s_name      VARCHAR(25) NOT NULL,
  s_address   VARCHAR(25) NOT NULL,
  s_city      VARCHAR(10) NOT NULL,
  s_nation    VARCHAR(15) NOT NULL,
  s_region    VARCHAR(12) NOT NULL,
  s_phone     VARCHAR(15) NOT NULL
);

CREATE TABLE customer 
(
  c_custkey      INTEGER NOT NULL,
  c_name         VARCHAR(25) NOT NULL,
  c_address      VARCHAR(25) NOT NULL,
  c_city         VARCHAR(10) NOT NULL,
  c_nation       VARCHAR(15) NOT NULL,
  c_region       VARCHAR(12) NOT NULL,
  c_phone        VARCHAR(15) NOT NULL,
  c_mktsegment   VARCHAR(10) NOT NULL
);

CREATE TABLE dwdate 
(
  d_datekey            INTEGER NOT NULL,
  d_date               VARCHAR(19) NOT NULL,
  d_dayofweek          VARCHAR(10) NOT NULL,
  d_month              VARCHAR(10) NOT NULL,
  d_year               INTEGER NOT NULL,
  d_yearmonthnum       INTEGER NOT NULL,
  d_yearmonth          VARCHAR(8) NOT NULL,
  d_daynuminweek       INTEGER NOT NULL,
  d_daynuminmonth      INTEGER NOT NULL,
  d_daynuminyear       INTEGER NOT NULL,
  d_monthnuminyear     INTEGER NOT NULL,
  d_weeknuminyear      INTEGER NOT NULL,
  d_sellingseason      VARCHAR(13) NOT NULL,
  d_lastdayinweekfl    VARCHAR(1) NOT NULL,
  d_lastdayinmonthfl   VARCHAR(1) NOT NULL,
  d_holidayfl          VARCHAR(1) NOT NULL,
  d_weekdayfl          VARCHAR(1) NOT NULL
);
CREATE TABLE lineorder 
(
  lo_orderkey          INTEGER NOT NULL,
  lo_linenumber        INTEGER NOT NULL,
  lo_custkey           INTEGER NOT NULL,
  lo_partkey           INTEGER NOT NULL,
  lo_suppkey           INTEGER NOT NULL,
  lo_orderdate         INTEGER NOT NULL,
  lo_orderpriority     VARCHAR(15) NOT NULL,
  lo_shippriority      VARCHAR(1) NOT NULL,
  lo_quantity          INTEGER NOT NULL,
  lo_extendedprice     INTEGER NOT NULL,
  lo_ordertotalprice   INTEGER NOT NULL,
  lo_discount          INTEGER NOT NULL,
  lo_revenue           INTEGER NOT NULL,
  lo_supplycost        INTEGER NOT NULL,
  lo_tax               INTEGER NOT NULL,
  lo_commitdate        INTEGER NOT NULL,
  lo_shipmode          VARCHAR(10) NOT NULL
);
  • �ǡ��������ɡʰ��̤ʤ���
copy customer from 's3://awssampledbuswest2/ssbgz/customer' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' 
gzip compupdate off region 'us-west-2';

copy dwdate from 's3://awssampledbuswest2/ssbgz/dwdate' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' 
gzip compupdate off region 'us-west-2';

copy lineorder from 's3://awssampledbuswest2/ssbgz/lineorder' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
gzip compupdate off region 'us-west-2';

copy part from 's3://awssampledbuswest2/ssbgz/part' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
gzip compupdate off region 'us-west-2';

copy supplier from 's3://awssampledbuswest2/ssbgz/supplier' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
gzip compupdate off region 'us-west-2';
  • �ơ��֥��������̤��η����������ɤ��줿���Ȥ���ǧ���롣
select count(*) from LINEORDER;
select count(*) from PART;
select count(*) from  CUSTOMER;
select count(*) from  SUPPLIER;
select count(*) from  DWDATE;
select stv_tbl_perm.name as table, count(*) as mb
from stv_blocklist, stv_tbl_perm
where stv_blocklist.tbl = stv_tbl_perm.id
and stv_blocklist.slice = stv_tbl_perm.slice
and stv_tbl_perm.name in ('lineorder','part','customer','dwdate','supplier')
group by stv_tbl_perm.name
order by 1 asc;
table         mb
customer 224
dwdate    160
lineorder  34,346
part          120
supplier   104
-- Query 1
-- Restrictions on only one dimension. 
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder, dwdate
where lo_orderdate = d_datekey
and d_year = 1997 
and lo_discount between 1 and 3 
and lo_quantity < 24;

-- Query 2
-- Restrictions on two dimensions 

select sum(lo_revenue), d_year, p_brand1
from lineorder, dwdate, part, supplier
where lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and p_category = 'MFGR#12'
and s_region = 'AMERICA'
group by d_year, p_brand1
order by d_year, p_brand1;

-- Query 3
-- Drill down in time to just one month 

select c_city, s_city, d_year, sum(lo_revenue) as revenue 
from customer, lineorder, supplier, dwdate
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_orderdate = d_datekey
and (c_city='UNITED KI1' or
c_city='UNITED KI5')
and (s_city='UNITED KI1' or
s_city='UNITED KI5')
and d_yearmonth = 'Dec1997'
group by c_city, s_city, d_year
order by d_year asc, revenue desc;

f:id:yohei-a:20170416195611p:image


�����ȥ�����ʬ�����������̤����Ѥ��륱����
drop table part cascade;
drop table supplier cascade;
drop table customer cascade;
drop table dwdate cascade;
drop table lineorder cascade;
  • �ơ��֥����ƺ���������ʬ�������������ȥ���������
CREATE TABLE part (
  p_partkey     	integer     	not null	sortkey distkey,
  p_name        	varchar(22) 	not null,
  p_mfgr        	varchar(6)      not null,
  p_category    	varchar(7)      not null,
  p_brand1      	varchar(9)      not null,
  p_color       	varchar(11) 	not null,
  p_type        	varchar(25) 	not null,
  p_size        	integer     	not null,
  p_container   	varchar(10)     not null
);

CREATE TABLE supplier (
  s_suppkey     	integer        not null sortkey,
  s_name        	varchar(25)    not null,
  s_address     	varchar(25)    not null,
  s_city        	varchar(10)    not null,
  s_nation      	varchar(15)    not null,
  s_region      	varchar(12)    not null,
  s_phone       	varchar(15)    not null)
diststyle all;

CREATE TABLE customer (
  c_custkey     	integer        not null sortkey,
  c_name        	varchar(25)    not null,
  c_address     	varchar(25)    not null,
  c_city        	varchar(10)    not null,
  c_nation      	varchar(15)    not null,
  c_region      	varchar(12)    not null,
  c_phone       	varchar(15)    not null,
  c_mktsegment      varchar(10)    not null)
diststyle all;

CREATE TABLE dwdate (
  d_datekey            integer       not null sortkey,
  d_date               varchar(19)   not null,
  d_dayofweek	      varchar(10)   not null,
  d_month      	    varchar(10)   not null,
  d_year               integer       not null,
  d_yearmonthnum       integer  	 not null,
  d_yearmonth          varchar(8)	not null,
  d_daynuminweek       integer       not null,
  d_daynuminmonth      integer       not null,
  d_daynuminyear       integer       not null,
  d_monthnuminyear     integer       not null,
  d_weeknuminyear      integer       not null,
  d_sellingseason      varchar(13)    not null,
  d_lastdayinweekfl    varchar(1)    not null,
  d_lastdayinmonthfl   varchar(1)    not null,
  d_holidayfl          varchar(1)    not null,
  d_weekdayfl          varchar(1)    not null)
diststyle all;

CREATE TABLE lineorder (
  lo_orderkey      	    integer     	not null,
  lo_linenumber        	integer     	not null,
  lo_custkey           	integer     	not null,
  lo_partkey           	integer     	not null distkey,
  lo_suppkey           	integer     	not null,
  lo_orderdate         	integer     	not null sortkey,
  lo_orderpriority     	varchar(15)     not null,
  lo_shippriority      	varchar(1)      not null,
  lo_quantity          	integer     	not null,
  lo_extendedprice     	integer     	not null,
  lo_ordertotalprice   	integer     	not null,
  lo_discount          	integer     	not null,
  lo_revenue           	integer     	not null,
  lo_supplycost        	integer     	not null,
  lo_tax               	integer     	not null,
  lo_commitdate         integer         not null,
  lo_shipmode          	varchar(10)     not null
);
  • �ǡ��������ɡʰ��̤�����
copy customer from 's3://awssampledbuswest2/ssbgz/customer' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' 
gzip region 'us-west-2';

copy dwdate from 's3://awssampledbuswest2/ssbgz/dwdate' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>' 
gzip region 'us-west-2';

copy lineorder from 's3://awssampledbuswest2/ssbgz/lineorder' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
gzip region 'us-west-2';

copy part from 's3://awssampledbuswest2/ssbgz/part' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
gzip region 'us-west-2';

copy supplier from 's3://awssampledbuswest2/ssbgz/supplier' 
credentials 'aws_access_key_id=<Your-Access-Key-ID>;aws_secret_access_key=<Your-Secret-Access-Key>'
gzip region 'us-west-2';
select stv_tbl_perm.name as "table", count(*) as "blocks (mb)"
from stv_blocklist, stv_tbl_perm
where stv_blocklist.tbl = stv_tbl_perm.id
and stv_blocklist.slice = stv_tbl_perm.slice
and stv_tbl_perm.name in ('customer', 'part', 'supplier', 'dwdate', 'lineorder')
group by stv_tbl_perm.name
order by 1 asc;
table          blocks(mb)
customer  604
dwdate     160
lineorder   26,359
part           200
supplier    236

���̤������ƥ֥��å��������ʤ��ʤäƤ���

select trim(name) as table, slice, sum(num_values) as rows, min(minvalue), max(maxvalue)
from svv_diskusage
where name in ('customer', 'part', 'supplier', 'dwdate', 'lineorder') 
and col =0
group by name, slice
order by name, slice;
-- Query 1
-- Restrictions on only one dimension. 
select sum(lo_extendedprice*lo_discount) as revenue
from lineorder, dwdate
where lo_orderdate = d_datekey
and d_year = 1997 
and lo_discount between 1 and 3 
and lo_quantity < 24;

-- Query 2
-- Restrictions on two dimensions 

select sum(lo_revenue), d_year, p_brand1
from lineorder, dwdate, part, supplier
where lo_orderdate = d_datekey
and lo_partkey = p_partkey
and lo_suppkey = s_suppkey
and p_category = 'MFGR#12'
and s_region = 'AMERICA'
group by d_year, p_brand1
order by d_year, p_brand1;

-- Query 3
-- Drill down in time to just one month 

select c_city, s_city, d_year, sum(lo_revenue) as revenue 
from customer, lineorder, supplier, dwdate
where lo_custkey = c_custkey
and lo_suppkey = s_suppkey
and lo_orderdate = d_datekey
and (c_city='UNITED KI1' or
c_city='UNITED KI5')
and (s_city='UNITED KI1' or
s_city='UNITED KI5')
and d_yearmonth = 'Dec1997'
group by c_city, s_city, d_year
order by d_year asc, revenue desc;

f:id:yohei-a:20170416212122p:image

����