
Analysis Notebooks

Analysis notebooks offer a scratch space for analyzing data in Ganymede Cloud. A fresh notebook instance includes templates for retrieving data and saving notebooks. To access a fresh notebook instance, go to the Flow Editor page, click the Analysis button in the header, and select default. The image below shows an example of what this notebook contains:

[Image: Ganymede Notebook]

The first 2 cells in the image provide templates for validating and querying data, while the last cell enables the notebook to be saved to git.

Info: To save the notebook under a different name, modify the dest key (i.e., replace new_notebook with the desired notebook name) and execute the cell.

Installing Python packages

A list of installed packages can be retrieved by running:

!pip freeze --local

Additional packages can be installed with pip from a notebook cell. For example, the following command installs several analytics and plotting packages:

!pip install scikit-learn seaborn matplotlib pandas_gbq
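Before installing, it can be useful to check which packages the notebook environment already provides. The following is a minimal sketch using only the Python standard library; the helper name missing_packages is hypothetical and not part of the Ganymede SDK. It takes import names, which can differ from the names used with pip.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if find_spec(name) is None]


# Import names can differ from pip names: scikit-learn is imported as sklearn.
wanted = ["sklearn", "seaborn", "matplotlib", "pandas_gbq"]
print(missing_packages(wanted))
```

Any names printed by this cell are candidates for a pip install command like the one above.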

Loading data from Ganymede data lake

Tables in the environment can be accessed via the Ganymede SDK. For example, the code below lists available tables:

from ganymede_sdk import Ganymede

g = Ganymede()
tables = g.list_tables()
for table in tables:
    print(table.table_id)

To run a query, assign an ANSI SQL snippet to the query_sql variable and pass it to the retrieve_sql method from Ganymede's query module. The results are returned as a pandas DataFrame. Below is an example of retrieving query results:

from ganymede_sdk import Ganymede

g = Ganymede()
query_sql = 'select * from ganymede_demo.demo_table'
# Returns the query results as a pandas DataFrame
g.retrieve_sql(query_sql)
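When exploring a large table, it may be preferable to retrieve only a few rows first. The sketch below builds such a query string; the helper preview_query is hypothetical (not part of the Ganymede SDK), and it assumes the SQL dialect in the environment accepts a LIMIT clause, which this page does not confirm. The table name is checked against a simple pattern so that arbitrary text is not interpolated into the query.

```python
import re

# Accepts dataset.table style names made of letters, digits, and underscores.
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def preview_query(table, limit=10):
    """Build a 'select *' query string that returns at most `limit` rows."""
    if not _TABLE_NAME.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"select * from {table} limit {limit}"


query_sql = preview_query("ganymede_demo.demo_table", limit=5)
print(query_sql)  # select * from ganymede_demo.demo_table limit 5
```

The resulting string can then be passed to retrieve_sql in the same way as a hand-written query.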

Saving notebooks

The final cell contains code that commits the notebook to the HEAD of the GitHub repository containing the stored Flow. The src entry in the files dictionary specifies the location of the notebook within the repository, and the dest entry specifies the name under which the notebook is committed.

from ganymede_sdk.notebook import save

files = [{'src': 'default', 'dest': 'notebook_to_save_to'}]
save(files)
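To save under a different name without editing the dictionary by hand, the files list can be built from a variable. This is a minimal sketch: build_files is a hypothetical helper, not part of ganymede_sdk, and it only mirrors the list-of-dictionaries shape shown above. It does not call save itself.

```python
def build_files(dest, src="default"):
    """Return a files list in the shape shown above: one dict with 'src' and 'dest' keys."""
    if not dest or not dest.strip():
        raise ValueError("dest must be a non-empty notebook name")
    return [{"src": src, "dest": dest.strip()}]


files = build_files("my_analysis")
print(files)  # [{'src': 'default', 'dest': 'my_analysis'}]
```

The returned list can be passed to save(files) as in the cell above.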