Hive

Hive provides a SQL-like syntax (HiveQL) that is translated into Map-Reduce jobs before submission to a cluster. You can think in SQL and let Hive handle the grunt work of translating your queries into Map-Reduce jobs. For additional details on Hive, refer to the Apache Hive documentation.
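To make the translation concrete, here is a minimal sketch of the kind of HiveQL statement that Hive compiles into one or more Map-Reduce jobs. The `page_views` table and its columns are hypothetical and used only for illustration.

```sql
-- Hypothetical table: page_views(user_id STRING, url STRING, view_time TIMESTAMP)
-- Count views per URL. The GROUP BY requires a shuffle, so Hive plans
-- map and reduce stages for this query.
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url
ORDER BY views DESC
LIMIT 10;
```

Prefixing the statement with `EXPLAIN` prints the planned stages instead of running the query, which is a quick way to see how Hive intends to execute it.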

A key aspect to note is that Hive jobs, like most Map-Reduce jobs, run in batch mode and take time to run depending on the quantity of data. You can cache the results of any run as data in HDFS, or export them to other systems such as SQL Server, Oracle, or MongoDB to serve the application layer directly. As an aside, newer processing engines built on top of Hadoop’s YARN aim to speed up Hive. One such engine is Tez, which is currently supported in the Syncfusion Big Data platform.
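As a rough sketch of caching a result in HDFS, standard HiveQL offers two common options: materializing the result as a table, or writing it to a directory. The `page_views` table, the `url_view_counts` name, and the output path below are placeholders, not names shipped with the product.

```sql
-- Option 1: materialize the result as a new Hive table
-- (its data files live in HDFS under the Hive warehouse directory).
CREATE TABLE url_view_counts AS
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url;

-- Option 2: write the result files to an HDFS directory of your choosing.
-- Note that any existing contents of the directory are replaced.
INSERT OVERWRITE DIRECTORY '/tmp/url_view_counts'
SELECT url, COUNT(*) AS views
FROM page_views
GROUP BY url;
```

Later queries can then read the small precomputed table instead of rerunning the batch job over the full data set.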

The Hive tab provides a user-friendly interface to manage and run Hive scripts with ease. It provides the following features.

Interactively run Hive scripts

Hive scripts can be run interactively from within Big Data Studio by directly typing Hive queries into the provided console.
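For example, a short interactive session might look like the following sketch. The statements are standard HiveQL; the `page_views` table is hypothetical, so substitute a table that exists in your environment.

```sql
-- List the available databases and switch to the built-in default one.
SHOW DATABASES;
USE default;

-- Inspect what is there before querying.
SHOW TABLES;
DESCRIBE page_views;   -- hypothetical table name

-- Peek at a few rows.
SELECT * FROM page_views LIMIT 10;
```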

Execute complete Hive script

You can execute the complete Hive script file loaded in the Editor by clicking the “Execute” button.

When you execute a script file using the “Execute” button, the output is displayed in a separate “Result” tab, in either grid or plain view, based on the “Result View” selection under the Hive tab.

Logs generated during execution are displayed under the Logs tab.

The history of Hive jobs submitted by clicking the “Execute” button is maintained separately and can be accessed through the History tab.

CSV Export

You can export the results generated by running a Hive script to CSV format by clicking the “CSV Export” button in the ribbon.

Run All

You can run all commands in the script file loaded in the Editor one by one through the interactive console by clicking the “Run All” button or by choosing the “Run in Console” option in the context menu.

Run Selection

You can run the selected commands in the script file one by one through the interactive console by clicking the “Run Selection” button or by choosing the “Run Selection in Console” option in the context menu.

Autocomplete

The Editor includes an autocomplete feature. It suggests keywords as you type, and you can accept a suggestion or move through the list by pressing the “down arrow” key.

Manage script files

You can create a new script file or load an existing file using the “Script” button.

You can save the script to a file using the “Save As” button.

You can also import scripts from a folder, create new scripts, and delete scripts present in the tree view.

NOTE

We ship several samples that you can use to get started.

Running Hive scripts in Tez mode

You can run Hive scripts in Tez mode by selecting “Tez” from the mode dropdown in the ribbon menu.
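For reference, stock Apache Hive selects the execution engine per session through the `hive.execution.engine` property. Whether the ribbon dropdown sets exactly this property behind the scenes is an assumption; the sketch below only shows the equivalent setting in a script.

```sql
-- Show the engine currently in effect for this session.
SET hive.execution.engine;

-- Switch the session to Tez; subsequent queries are planned as Tez DAGs.
SET hive.execution.engine=tez;

-- Switch back to classic Map-Reduce if needed.
SET hive.execution.engine=mr;
```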

Working with HBase

You can work directly with data residing in the HBase instance shipped with our SDK through Hive. We ship samples that showcase this as well.
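A minimal sketch of how Hive is typically pointed at an existing HBase table uses Hive's HBase storage handler. The HBase table name `customers`, the column family `info`, and the Hive table name `hbase_customers` are hypothetical; the shipped samples may use different names and mappings.

```sql
-- Map an existing HBase table into Hive as an external table.
-- ':key' binds the first Hive column to the HBase row key; the remaining
-- entries bind Hive columns to family:qualifier pairs, in order.
CREATE EXTERNAL TABLE hbase_customers (
  row_key STRING,
  name    STRING,
  city    STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  'hbase.columns.mapping' = ':key,info:name,info:city'
)
TBLPROPERTIES ('hbase.table.name' = 'customers');

-- Once mapped, the HBase data can be queried like any other Hive table.
SELECT city, COUNT(*) AS customer_count
FROM hbase_customers
GROUP BY city;
```

Because the table is declared `EXTERNAL`, dropping it in Hive removes only the mapping and leaves the HBase data in place.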

Working with Hive Database

Big Data Studio provides a simple interface to create new databases and tables, and a tree-like explorer to manage Hive databases.

Create Database

Click the “New Database” button in the Hive tab, enter the database name in the prompt, and click the “Create” button to create a new database.
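The same result can be achieved from a script or the console with standard HiveQL. The database name `sales_db` below is a placeholder.

```sql
-- Create the database only if it does not already exist.
CREATE DATABASE IF NOT EXISTS sales_db
COMMENT 'Hypothetical database used for illustration';

-- Make it the current database for the statements that follow.
USE sales_db;
```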

Create Table

To create a new table, click the “New Table” button in the Hive tab, select a database, and provide the Hive query that creates the table, based on the template provided in the editor area of the prompt box.
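The exact template shown in the prompt box is not reproduced here. As a hedged illustration, a typical `CREATE TABLE` statement for comma-delimited text data looks like the following; the `orders` table and its columns are hypothetical.

```sql
-- Creates the table in the currently selected database.
CREATE TABLE IF NOT EXISTS orders (
  order_id   INT,
  customer   STRING,
  amount     DOUBLE,
  order_date STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','   -- one record per line, comma-separated fields
STORED AS TEXTFILE;
```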

Hive Data Explorer

You can explore the databases and tables in Hive in a simple tree-like view under the Databases tab.

Open the context menu by right-clicking a database, a table, or an empty area of the tree to explore the available options.

You can create databases and tables, view the top 500 rows of a table, drop databases and tables, alter tables, and add new columns through a simple interface.
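For readers who prefer scripts, the explorer actions above correspond roughly to the following standard HiveQL statements. The statements the explorer actually issues are not documented here, so treat this as an assumption-based sketch; `orders`, `customer_orders`, and `sales_db` are placeholder names.

```sql
-- View the top 500 rows of a table.
SELECT * FROM orders LIMIT 500;

-- Add a new column to an existing table.
ALTER TABLE orders ADD COLUMNS (status STRING COMMENT 'Order status');

-- Alter a table, for example by renaming it.
ALTER TABLE orders RENAME TO customer_orders;

-- Drop a table, then a database together with any remaining tables.
DROP TABLE IF EXISTS customer_orders;
DROP DATABASE IF EXISTS sales_db CASCADE;
```

Without `CASCADE`, Hive refuses to drop a database that still contains tables.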

Access Hive data from .NET applications

Data stored in Hive can be accessed through a Syncfusion-provided .NET API (Syncfusion.ThriftHive.Base). This API provides user-friendly access to Hive data from within the .NET environment. Several samples are shipped with the Big Data Studio product. You can navigate to these from the Syncfusion Big Data dashboard as shown below.

Hive C# samples

Select any platform from the dashboard under “Connect to Hive from C#” to view the corresponding sample browser. The screenshot below shows the WPF sample browser.

Please note that these samples always connect to the local instance of Syncfusion Big Data. You will have to change the server in the code to connect to a remote Syncfusion Big Data server.

For more information about the Thrift library (Syncfusion.ThriftHive.Base), refer to the documentation available here.