HowTo: AWS CLI Elastic MapReduce - Interactive Hive
Throughout this series we look at how to extract the API request made by the Elastic MapReduce Ruby client and use it to build the equivalent command with the AWS CLI tool. This article looks specifically at running an interactive Hive session.
Credentials
~/.aws/credentials.json
{
    "access_id": "C99F5C7EE00F1EXAMPLE",
    "private_key": "a63xWEj9ZFbigxqA7wI3Nuwj3mte3RDBdEXAMPLE",
    "keypair": "my-key",
    "key-pair-file": "~/.ssh/my-key.pem",
    "log_uri": "s3n://my-bucket/hadoop/",
    "region": "us-east-1"
}
Start cluster
Elastic MapReduce Ruby Client
Console - user@hostname ~ $
elastic-mapreduce -v \
  --create \
  --name "Interactive Hive" \
  --alive \
  --instance-group MASTER \
    --bid-price 0.06 \
    --instance-count 1 \
    --instance-type m1.small \
  --instance-group CORE \
    --bid-price 0.06 \
    --instance-count 2 \
    --instance-type m1.small \
  --instance-group TASK \
    --bid-price 0.06 \
    --instance-count 2 \
    --instance-type m1.small \
  --hive-interactive \
  --visible-to-all-users \
  -c ~/.aws/credentials.json
Output
Requesting URL:
https://us-east-1.elasticmapreduce.amazonaws.com/
Query string:
Instances.KeepJobFlowAliveWhenNoSteps=true&LogUri=s3n%3A%2F%2Fmy-bucket%2Fhadoop%2F&Steps.member.1.HadoopJarStep.Args.member.5=--hive-versions&Steps.member.1.HadoopJarStep.Args.member.4=--install-hive&Instances.Ec2KeyName=my-key&Instances.InstanceGroups.member.1.InstanceRole=MASTER&Name=Interactive%20Hive&Instances.InstanceGroups.member.2.InstanceType=m1.small&Steps.member.1.HadoopJarStep.Args.member.3=s3%3A%2F%2Fus-east-1.elasticmapreduce%2Flibs%2Fhive%2F&Steps.member.1.HadoopJarStep.Jar=s3%3A%2F%2Fus-east-1.elasticmapreduce%2Flibs%2Fscript-runner%2Fscript-runner.jar&Instances.InstanceGroups.member.1.Market=SPOT&Timestamp=2013-04-16T05%3A21%3A03%2B00%3A00&Instances.InstanceGroups.member.1.BidPrice=0.06&Instances.InstanceGroups.member.2.Market=SPOT&VisibleToAllUsers=true&SignatureVersion=2&AWSAccessKeyId=C99F5C7EE00F1EXAMPLE&Instances.InstanceGroups.member.3.InstanceRole=TASK&Instances.InstanceGroups.member.2.InstanceRole=CORE&Instances.TerminationProtected=false&Instances.InstanceGroups.member.1.InstanceCount=1&Steps.member.1.ActionOnFailure=TERMINATE_JOB_FLOW&Instances.InstanceGroups.member.3.InstanceType=m1.small&Instances.InstanceGroups.member.3.InstanceCount=2&Steps.member.1.Name=Setup%20Hive&Instances.InstanceGroups.member.3.BidPrice=0.06&Instances.InstanceGroups.member.3.Market=SPOT&Instances.InstanceGroups.member.1.InstanceType=m1.small&ContentType=JSON&Steps.member.1.HadoopJarStep.Args.member.2=--base-path&Signature=gDujEissx3TYMAGGGMM7vJX%2Bfu%2FYzxZnAOIA5ogKm34%3D&Instances.InstanceGroups.member.2.InstanceCount=2&Instances.InstanceGroups.member.3.Name=Task%20Instance%20Group&Action=RunJobFlow&Instances.InstanceGroups.member.2.BidPrice=0.06&Steps.member.1.HadoopJarStep.Args.member.1=s3%3A%2F%2Fus-east-1.elasticmapreduce%2Flibs%2Fhive%2Fhive-script&Instances.InstanceGroups.member.1.Name=Master%20Instance%20Group&Steps.member.1.HadoopJarStep.Args.member.6=latest&AmiVersion=latest&SignatureMethod=HmacSHA256&Instances.InstanceGroups.member.2.Name=Core%20Instance%20Group
Headers:
x-amzn-RequestIdb3b3a806-213e-49de-a3f1-a705d687f67fHostus-east-1.elasticmapreduce.amazonaws.com:443User-Agentruby-client
Created job flow j-A94KBLK016YJA
Formatted Output
Output - Requesting URL
https://us-east-1.elasticmapreduce.amazonaws.com/
Output - Parameters
AWSAccessKeyId=C99F5C7EE00F1EXAMPLE
Action=RunJobFlow
AmiVersion=latest
ContentType=JSON
Instances.Ec2KeyName=my-key
Instances.InstanceGroups.member.1.BidPrice=0.06
Instances.InstanceGroups.member.1.InstanceCount=1
Instances.InstanceGroups.member.1.InstanceRole=MASTER
Instances.InstanceGroups.member.1.InstanceType=m1.small
Instances.InstanceGroups.member.1.Market=SPOT
Instances.InstanceGroups.member.1.Name=Master Instance Group
Instances.InstanceGroups.member.2.BidPrice=0.06
Instances.InstanceGroups.member.2.InstanceCount=2
Instances.InstanceGroups.member.2.InstanceRole=CORE
Instances.InstanceGroups.member.2.InstanceType=m1.small
Instances.InstanceGroups.member.2.Market=SPOT
Instances.InstanceGroups.member.2.Name=Core Instance Group
Instances.InstanceGroups.member.3.BidPrice=0.06
Instances.InstanceGroups.member.3.InstanceCount=2
Instances.InstanceGroups.member.3.InstanceRole=TASK
Instances.InstanceGroups.member.3.InstanceType=m1.small
Instances.InstanceGroups.member.3.Market=SPOT
Instances.InstanceGroups.member.3.Name=Task Instance Group
Instances.KeepJobFlowAliveWhenNoSteps=true
Instances.TerminationProtected=false
LogUri=s3n://my-bucket/hadoop/
Name=Interactive Hive
Signature=gDujEissx3TYMAGGGMM7vJX+fu/YzxZnAOIA5ogKm34=
SignatureMethod=HmacSHA256
SignatureVersion=2
Steps.member.1.ActionOnFailure=TERMINATE_JOB_FLOW
Steps.member.1.HadoopJarStep.Args.member.1=s3://us-east-1.elasticmapreduce/libs/hive/hive-script
Steps.member.1.HadoopJarStep.Args.member.2=--base-path
Steps.member.1.HadoopJarStep.Args.member.3=s3://us-east-1.elasticmapreduce/libs/hive/
Steps.member.1.HadoopJarStep.Args.member.4=--install-hive
Steps.member.1.HadoopJarStep.Args.member.5=--hive-versions
Steps.member.1.HadoopJarStep.Args.member.6=latest
Steps.member.1.HadoopJarStep.Jar=s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
Steps.member.1.Name=Setup Hive
Timestamp=2013-04-16T05:21:03+00:00
VisibleToAllUsers=true
Output - Headers
Host: us-east-1.elasticmapreduce.amazonaws.com:443
User-Agent: ruby-client
x-amzn-RequestId: b3b3a806-213e-49de-a3f1-a705d687f67f
Output - Non-verbose output
Created job flow j-A94KBLK016YJA
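The formatted parameter list above can be produced mechanically from the raw "Query string:" value: split on `&`, sort by parameter name, and percent-decode each pair. The following is only a minimal sketch of that idea. The function name `format_query_string` is hypothetical (it is not part of either tool), and the input is assumed to be well-formed percent-encoding supplied on standard input.

```shell
# format_query_string: read a raw query string on stdin and print one
# decoded "Key=Value" pair per line, sorted by byte order.
# Hypothetical helper; assumes every '%' is followed by two hex digits.
format_query_string() {
  tr '&' '\n' | LC_ALL=C sort | awk '
    {
      hexdigits = "0123456789ABCDEF"
      out = ""
      line = $0
      # Replace each %XX escape with the character it encodes.
      while ((i = index(line, "%")) > 0) {
        hex = toupper(substr(line, i + 1, 2))
        hi = index(hexdigits, substr(hex, 1, 1)) - 1
        lo = index(hexdigits, substr(hex, 2, 1)) - 1
        out = out substr(line, 1, i - 1) sprintf("%c", hi * 16 + lo)
        line = substr(line, i + 3)
      }
      print out line
    }'
}
```

With the raw query string saved to a file (say `query.txt`, a name chosen here for illustration), `format_query_string < query.txt` prints the parameters in the same one-per-line layout used in this section.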
API Request
Example API Request
https://us-east-1.elasticmapreduce.amazonaws.com/
?Action=RunJobFlow
&Name=Interactive Hive
&Instances.Ec2KeyName=my-key
&Instances.InstanceGroups.member.1.Name=Master Instance Group
&Instances.InstanceGroups.member.1.InstanceRole=MASTER
&Instances.InstanceGroups.member.1.InstanceType=m1.small
&Instances.InstanceGroups.member.1.InstanceCount=1
&Instances.InstanceGroups.member.1.Market=SPOT
&Instances.InstanceGroups.member.1.BidPrice=0.06
&Instances.InstanceGroups.member.2.Name=Core Instance Group
&Instances.InstanceGroups.member.2.InstanceRole=CORE
&Instances.InstanceGroups.member.2.InstanceType=m1.small
&Instances.InstanceGroups.member.2.InstanceCount=2
&Instances.InstanceGroups.member.2.Market=SPOT
&Instances.InstanceGroups.member.2.BidPrice=0.06
&Instances.InstanceGroups.member.3.Name=Task Instance Group
&Instances.InstanceGroups.member.3.InstanceRole=TASK
&Instances.InstanceGroups.member.3.InstanceType=m1.small
&Instances.InstanceGroups.member.3.InstanceCount=2
&Instances.InstanceGroups.member.3.Market=SPOT
&Instances.InstanceGroups.member.3.BidPrice=0.06
&Instances.KeepJobFlowAliveWhenNoSteps=true
&Instances.TerminationProtected=false
&Steps.member.1.Name=Setup Hive
&Steps.member.1.ActionOnFailure=TERMINATE_JOB_FLOW
&Steps.member.1.HadoopJarStep.Jar=s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
&Steps.member.1.HadoopJarStep.Args.member.1=s3://us-east-1.elasticmapreduce/libs/hive/hive-script
&Steps.member.1.HadoopJarStep.Args.member.2=--base-path
&Steps.member.1.HadoopJarStep.Args.member.3=s3://us-east-1.elasticmapreduce/libs/hive/
&Steps.member.1.HadoopJarStep.Args.member.4=--install-hive
&Steps.member.1.HadoopJarStep.Args.member.5=--hive-versions
&Steps.member.1.HadoopJarStep.Args.member.6=latest
&LogUri=s3n://my-bucket/hadoop/
&AmiVersion=latest
&VisibleToAllUsers=true
&*AUTHPARAMS*
AWS CLI
Console - user@hostname ~ $
aws --region us-east-1 emr \
  run-job-flow \
    --name "Interactive Hive" \
    --instances "{
      \"ec_2_key_name\": \"my-key\",
      \"instance_groups\": [
        {
          \"name\": \"Master Instance Group\",
          \"instance_role\": \"MASTER\",
          \"instance_type\": \"m1.small\",
          \"instance_count\": 1,
          \"market\": \"SPOT\",
          \"bid_price\": \"0.06\"
        },
        {
          \"name\": \"Core Instance Group\",
          \"instance_role\": \"CORE\",
          \"instance_type\": \"m1.small\",
          \"instance_count\": 2,
          \"market\": \"SPOT\",
          \"bid_price\": \"0.06\"
        },
        {
          \"name\": \"Task Instance Group\",
          \"instance_role\": \"TASK\",
          \"instance_type\": \"m1.small\",
          \"instance_count\": 2,
          \"market\": \"SPOT\",
          \"bid_price\": \"0.06\"
        }
      ],
      \"keep_job_flow_alive_when_no_steps\": true,
      \"termination_protected\": false
    }" \
    --steps "[
      {
        \"name\": \"Setup Hive\",
        \"action_on_failure\": \"TERMINATE_JOB_FLOW\",
        \"hadoop_jar_step\": {
          \"jar\": \"s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar\",
          \"args\": [
            \"s3://us-east-1.elasticmapreduce/libs/hive/hive-script\",
            \"--base-path\",
            \"s3://us-east-1.elasticmapreduce/libs/hive/\",
            \"--install-hive\",
            \"--hive-versions\",
            \"latest\"
          ]
        }
      }
    ]" \
    --log-uri "s3n://my-bucket/hadoop/" \
    --ami-version "latest" \
    --visible-to-all-users
Output
{
    "ResponseMetadata": {
        "RequestId": "9a10b614-a65a-11e2-ba8d-d59ce4b37f90"
    },
    "JobFlowId": "j-Y9CC7P8SFNAU"
}
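Every later command needs the job flow ID from this response, so it can help to capture it in a variable instead of copying it by hand. A minimal sketch follows. The helper name `extract_job_flow_id` is hypothetical, and the `sed` pattern assumes the key and its value sit on one line, as they do in the output above. Given the `jq` calls used later in this article, `jq -r '.JobFlowId'` would be an equally reasonable choice.

```shell
# extract_job_flow_id: read a run-job-flow JSON response on stdin and
# print only the value of "JobFlowId".
# Hypothetical helper; assumes "JobFlowId": "<id>" appears on one line.
extract_job_flow_id() {
  sed -n 's/.*"JobFlowId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

Piping the `run-job-flow` command into the helper, for example `JOB_FLOW_ID=$(aws --region us-east-1 emr run-job-flow ... | extract_job_flow_id)` with the full arguments from the command above in place of `...`, would leave the ID in `$JOB_FLOW_ID` for the describe and terminate calls.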
Describe cluster
Elastic MapReduce Ruby Client
Console - user@hostname ~ $
elastic-mapreduce --describe j-Y9CC7P8SFNAU
API Request
Example API Request
https://us-east-1.elasticmapreduce.amazonaws.com/
?Action=DescribeJobFlows
&JobFlowIds.member.1=j-Y9CC7P8SFNAU
&*AUTHPARAMS*
AWS CLI
Console - user@hostname ~ $
aws --region us-east-1 emr \
  describe-job-flows \
    --job-flow-ids "[\"j-Y9CC7P8SFNAU\"]"
Output
{
    "JobFlows": [
        {
            "Name": "Interactive Hive",
            "BootstrapActions": [],
            "Instances": {
                "InstanceCount": 5,
                "Placement": {
                    "AvailabilityZone": "us-east-1e"
                },
                "MasterPublicDnsName": "ec2-204-236-247-160.compute-1.amazonaws.com",
                "NormalizedInstanceHours": 0,
                "MasterInstanceId": "i-ac9ca5cf",
                "InstanceGroups": [
                    {
                        "ReadyDateTime": "2013-04-16T06:10:17Z",
                        "InstanceType": "m1.small",
                        "InstanceRole": "MASTER",
                        "InstanceRunningCount": 1,
                        "State": "RUNNING",
                        "BidPrice": "0.06",
                        "Market": "SPOT",
                        "StartDateTime": "2013-04-16T06:08:42Z",
                        "InstanceGroupId": "ig-2Z6XY7WSAQKOL",
                        "CreationDateTime": "2013-04-16T05:58:03Z",
                        "InstanceRequestCount": 1,
                        "LastStateChangeReason": "",
                        "Name": "Master Instance Group"
                    },
                    {
                        "ReadyDateTime": "2013-04-16T06:10:24Z",
                        "InstanceType": "m1.small",
                        "InstanceRole": "CORE",
                        "InstanceRunningCount": 2,
                        "State": "RUNNING",
                        "BidPrice": "0.06",
                        "Market": "SPOT",
                        "StartDateTime": "2013-04-16T06:10:24Z",
                        "InstanceGroupId": "ig-3UPDGFCSJLXNW",
                        "CreationDateTime": "2013-04-16T05:58:03Z",
                        "InstanceRequestCount": 2,
                        "LastStateChangeReason": "",
                        "Name": "Core Instance Group"
                    },
                    {
                        "ReadyDateTime": "2013-04-16T06:14:55Z",
                        "InstanceType": "m1.small",
                        "InstanceRole": "TASK",
                        "InstanceRunningCount": 2,
                        "State": "RUNNING",
                        "BidPrice": "0.06",
                        "Market": "SPOT",
                        "StartDateTime": "2013-04-16T06:14:55Z",
                        "InstanceGroupId": "ig-YQX07WNO910N",
                        "CreationDateTime": "2013-04-16T05:58:03Z",
                        "InstanceRequestCount": 2,
                        "LastStateChangeReason": "Resizing complete",
                        "Name": "Task Instance Group"
                    }
                ],
                "MasterInstanceType": "m1.small",
                "TerminationProtected": false,
                "HadoopVersion": "1.0.3",
                "KeepJobFlowAliveWhenNoSteps": true,
                "SlaveInstanceType": "m1.small",
                "Ec2KeyName": "my-key"
            },
            "Steps": [
                {
                    "ExecutionStatusDetail": {
                        "State": "COMPLETED",
                        "EndDateTime": "2013-04-16T06:11:35Z",
                        "CreationDateTime": "2013-04-16T05:58:03Z",
                        "StartDateTime": "2013-04-16T06:10:23Z"
                    },
                    "StepConfig": {
                        "HadoopJarStep": {
                            "Args": [
                                "s3://us-east-1.elasticmapreduce/libs/hive/hive-script",
                                "--base-path",
                                "s3://us-east-1.elasticmapreduce/libs/hive/",
                                "--install-hive",
                                "--hive-versions",
                                "latest"
                            ],
                            "Jar": "s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar",
                            "Properties": []
                        },
                        "Name": "Setup Hive",
                        "ActionOnFailure": "TERMINATE_JOB_FLOW"
                    }
                }
            ],
            "ExecutionStatusDetail": {
                "State": "WAITING",
                "ReadyDateTime": "2013-04-16T06:10:24Z",
                "CreationDateTime": "2013-04-16T05:58:03Z",
                "StartDateTime": "2013-04-16T06:10:24Z",
                "LastStateChangeReason": "Waiting after step completed"
            },
            "VisibleToAllUsers": true,
            "JobFlowId": "j-Y9CC7P8SFNAU",
            "LogUri": "s3n://my-bucket/hadoop/",
            "AmiVersion": "2.3.3",
            "SupportedProducts": []
        }
    ],
    "ResponseMetadata": {
        "RequestId": "71ec785e-a660-11e2-a5bf-b1f408d32c54"
    }
}
Connect to Master
Wait until the execution state is WAITING
Console - user@hostname ~ $
aws --region us-east-1 emr \
  describe-job-flows \
    --job-flow-ids "[\"j-Y9CC7P8SFNAU\"]" \
  | jq -r '.JobFlows[0].ExecutionStatusDetail.State'
Output
WAITING
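The cluster takes several minutes to reach WAITING, so the state check above usually has to be repeated. One way to automate that is a small polling loop, sketched below under stated assumptions. The names `wait_for_state`, `job_flow_state`, and `POLL_INTERVAL` are hypothetical and introduced only for illustration. The loop stops early on COMPLETED, FAILED, or TERMINATED, on the assumption that a job flow in one of those states will never reach WAITING.

```shell
# wait_for_state TARGET COMMAND [ARGS...]
# Run COMMAND repeatedly until it prints TARGET (returns 0), or until it
# prints a state the job flow cannot recover from (returns 1).
# POLL_INTERVAL (seconds between checks, default 30) is an assumed knob.
wait_for_state() {
  target=$1
  shift
  while :; do
    state=$("$@")
    echo "state: $state" >&2
    case $state in
      "$target") return 0 ;;
      COMPLETED|FAILED|TERMINATED) return 1 ;;
    esac
    sleep "${POLL_INTERVAL:-30}"
  done
}

# job_flow_state JOB_FLOW_ID
# Print the execution state, using the same command as in this article.
job_flow_state() {
  aws --region us-east-1 emr \
    describe-job-flows \
      --job-flow-ids "[\"$1\"]" \
    | jq -r '.JobFlows[0].ExecutionStatusDetail.State'
}
```

Running `wait_for_state WAITING job_flow_state j-Y9CC7P8SFNAU` would then block until the cluster is ready for the SSH step. Passing the state command as arguments keeps the loop independent of how the state is fetched.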
Get the master public DNS name
Console - user@hostname ~ $
aws --region us-east-1 emr \
  describe-job-flows \
    --job-flow-ids "[\"j-Y9CC7P8SFNAU\"]" \
  | jq -r '.JobFlows[0].Instances.MasterPublicDnsName'
Output
ec2-204-236-247-160.compute-1.amazonaws.com
SSH to the master as the user hadoop, using the SSH key specified when starting the cluster.
Console - user@hostname ~ $
ssh -i ~/.ssh/my-key.pem hadoop@ec2-204-236-247-160.compute-1.amazonaws.com
Run hive on the master to start the interactive session.
Console - hadoop@master ~ $
hive
Terminate Cluster
Elastic MapReduce Ruby Client
Console - user@hostname ~ $
elastic-mapreduce --terminate j-Y9CC7P8SFNAU
API Request
Example API Request
https://us-east-1.elasticmapreduce.amazonaws.com/
?Action=TerminateJobFlows
&JobFlowIds.member.1=j-Y9CC7P8SFNAU
&*AUTHPARAMS*
AWS CLI
Console - user@hostname ~ $
aws --region us-east-1 emr \
  terminate-job-flows \
    --job-flow-ids "[\"j-Y9CC7P8SFNAU\"]"
Output
{
    "ResponseMetadata": {
        "RequestId": "b4dcda43-a660-11e2-830b-5523a2fd5603"
    }
}
Parts in this series
- HowTo: AWS CLI Elastic MapReduce
- HowTo: AWS CLI Elastic MapReduce - Hive Script
- HowTo: AWS CLI Elastic MapReduce - Interactive Hive
- HowTo: AWS CLI Elastic MapReduce - Pig Script
- HowTo: AWS CLI Elastic MapReduce - Interactive Pig
- HowTo: AWS CLI Elastic MapReduce - Streaming Job Flow
- HowTo: AWS CLI Elastic MapReduce - Cascading Job Flow
- HowTo: AWS CLI Elastic MapReduce - Custom JAR Job Flow
- HowTo: AWS CLI Elastic MapReduce - HBase