In this series, we see how to extract information from the Elastic MapReduce Ruby client and use it to build the equivalent command with the AWS CLI tool. This article looks specifically at running an interactive Hive session.

In this series, we see how to extract information from the Elastic MapReduce Ruby client and use it to build the equivalent command with the AWS CLI tool. This article looks specifically at running a Hive script.

Elastic MapReduce is one of the services that the AWS CLI tool covers. However, if you look at the help information (aws emr run-job-flow help), you will see that there are many parameters and no obvious way of finding out what their values should be.

Looking at the developer guide for Create a Job Flow and the API reference for RunJobFlow will help a bit. However, there is one resource that can help immensely if we can get it to reveal its secrets: the Elastic MapReduce Ruby client. See HowTo: Install AWS CLI - Amazon Elastic MapReduce Ruby Client if you need to install it.

There is a verbose option, -v, that shows the URL, query string, and headers that are sent (the headers are not needed for the AWS CLI tool). This series of articles goes through the examples from the developer guide for Create a Job Flow with the verbose option turned on, showing the equivalent AWS CLI command for each.
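
A query string is hard to read in its raw, URL-encoded form, so it can help to decode it into one parameter per line before working out the matching AWS CLI options. The following is a minimal Python sketch of that step. The query string is a made-up sample rather than real output from the Ruby client, and decode_query is a hypothetical helper name used only for this illustration.

```python
from urllib.parse import parse_qsl

# Hypothetical query string, shaped like what the verbose output might show.
# The parameter names and values are illustrative assumptions, not captured output.
query = (
    "Action=RunJobFlow"
    "&Name=Example%20Job%20Flow"
    "&Instances.InstanceCount=1"
    "&Instances.MasterInstanceType=m1.small"
    "&Steps.member.1.Name=Example%20Step"
)


def decode_query(qs):
    """Return the URL-decoded parameters as a dict keyed by parameter name."""
    return dict(parse_qsl(qs, keep_blank_values=True))


params = decode_query(query)

# Print one parameter per line, sorted so related names group together.
for name in sorted(params):
    print(f"{name} = {params[name]}")
```

Sorting the names keeps parameters with a shared prefix next to each other, which makes it easier to see which values belong to the same group.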

Parts in this series

jq is a command-line JSON processor; with it you can map, filter, slice, and transform JSON. We will use jq with JSON much like we would use sed, awk, and grep with text. Since the AWS CLI tool returns JSON, we can use jq to parse the data. This can actually make things easier than with the older AWS command line tools that return text: rather than using grep and awk to find the rows and columns we are interested in, we can query for the specific attributes in the JSON that we are looking for.
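
To make the idea of querying by attribute concrete, here is a small sketch written in Python rather than jq itself. The JSON document, its field names (JobFlows, JobFlowId, State), and the ids_in_state helper are all assumptions invented for the illustration; a real AWS CLI response may be shaped differently.

```python
import json

# Hypothetical response text; the structure and field names are assumptions
# made for this example, not a documented AWS CLI response.
response_text = """
{
  "JobFlows": [
    {"JobFlowId": "j-EXAMPLE1", "Name": "hive interactive", "State": "WAITING"},
    {"JobFlowId": "j-EXAMPLE2", "Name": "hive script", "State": "COMPLETED"}
  ]
}
"""


def ids_in_state(text, state):
    """Return the JobFlowId of every job flow whose State equals `state`."""
    doc = json.loads(text)
    return [jf["JobFlowId"] for jf in doc["JobFlows"] if jf["State"] == state]


waiting = ids_in_state(response_text, "WAITING")
print(waiting)
```

The lookup names the attributes it wants instead of counting columns, so it keeps working if fields are added or reordered. Against the same hypothetical document, a comparable jq filter would be .JobFlows[] | select(.State == "WAITING") | .JobFlowId.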