Jupyter Labs/Notes

Installation

See Jupyter.org. Quick guide.

$ python3 -m venv venv-jupyter
$ source venv-jupyter/bin/activate
$ pip install jupyterlab
$ pip install notebook
$ pip install jupyterthemes

Remember to start Jupyter from inside the venv so that it picks up locally installed packages and any modules under development.
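A quick way to confirm that a notebook kernel really runs inside the venv is to compare the interpreter prefixes. This is a minimal sketch; the helper name in_virtualenv is made up for illustration.

```python
import sys

def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # In a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix

print("Interpreter:", sys.executable)
print("Inside a venv:", in_virtualenv())
```

If the second line prints False, the kernel is most likely using the system Python rather than venv-jupyter.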

Tips

Extend cell width

Copy and run the following in the first cell:

from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

Pretty Display of Variables

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

Keyboard shortcuts

  • CMD+SHIFT+P Open the command palette, which lists commands and their keyboard shortcuts
  • SHIFT+J, SHIFT+K Extend the selection down / up (then copy, cut, paste)
  • SHIFT+M Merge highlighted cells

IPython Magic

Magic commands in IPython are also available in Jupyter.

help

  • %lsmagic - List magic commands
  • %magic_command? - Help on a specific magic_command

%env

Set environment variables

# Running %env without any arguments
# lists all environment variables.

# The line below sets the environment
# variable OMP_NUM_THREADS
%env OMP_NUM_THREADS=4

env: OMP_NUM_THREADS=4
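Outside a notebook, the same effect can be had in plain Python through os.environ. The sketch below mirrors the %env call above; the variable name omp_threads is only illustrative.

```python
import os

# Equivalent of `%env OMP_NUM_THREADS=4`: values are always strings.
os.environ["OMP_NUM_THREADS"] = "4"

# Equivalent of `%env OMP_NUM_THREADS`: read one variable back.
omp_threads = os.environ.get("OMP_NUM_THREADS")
print(f"OMP_NUM_THREADS={omp_threads}")
```

Libraries that read such a variable typically do so when they are first imported, so it is safest to set it before the import.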

%run

Execute Python code from an external .py file or from another Jupyter notebook. Note that %run is not the same as importing a Python module.

# this will execute and show the output from
# all code cells of the specified notebook
%run ./two-histograms.ipynb
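The difference between running and importing can be seen in plain Python. In this sketch, runpy.run_path stands in for %run (IPython's own implementation differs in details, for example it also copies the script's variables into the interactive namespace), and the file name demo_script.py is hypothetical.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

SCRIPT = '''
value = 21 * 2
ran_as_main = __name__ == "__main__"
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_script.py"  # hypothetical file name
    path.write_text(SCRIPT)

    # Roughly what %run does: execute the file as a program.
    run_ns = runpy.run_path(str(path), run_name="__main__")

    # What import does: execute the file as a module.
    spec = importlib.util.spec_from_file_location("demo_script", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(run_ns["ran_as_main"])  # True: __name__ was "__main__"
print(module.ran_as_main)     # False: __name__ was "demo_script"
```

This is why an `if __name__ == "__main__":` block fires under %run but stays silent on import.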

%load

Insert the contents of an external script into the cell.

# Before running
%load ./hello_world.py

# After running
# %load ./hello_world.py
if __name__ == "__main__":
    print("Hello World!")

Hello World!

%who

List all variables in the global scope, optionally filtered by type (str in the example below).

one = "for the money"
two = "for the show"
three = "to get ready now go cat go"
%who str

one three two
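For intuition, the type filter can be approximated in plain Python by scanning a namespace dictionary. This is only a sketch: the function who and the dictionary demo_namespace are made up here, and %who matches on the type name rather than with isinstance.

```python
def who(namespace, kind=None):
    """List public names in `namespace`, optionally only those of type `kind`."""
    return sorted(
        name
        for name, value in namespace.items()
        if not name.startswith("_")
        and (kind is None or isinstance(value, kind))
    )

demo_namespace = {
    "one": "for the money",
    "two": "for the show",
    "three": "to get ready now go cat go",
    "count": 3,
}

print(who(demo_namespace, str))  # ['one', 'three', 'two']
```

In a notebook, passing globals() instead of demo_namespace would scan the interactive namespace.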

Timing

Useful magics for timing: %%time, %%timeit and %timeit

%%time
import time
for _ in range(1000):
    time.sleep(0.01)  # sleep for 0.01 seconds

CPU times: user 21.5 ms, sys: 14.8 ms, total: 36.3 ms
Wall time: 11.6 s
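The gap between CPU time and wall time in this output comes from sleeping, which waits without using the CPU. A rough sketch of the same two measurements with the standard time module (not how %%time is implemented, just the same idea):

```python
import time

wall_start = time.perf_counter()  # wall-clock timer
cpu_start = time.process_time()   # CPU time of this process (user + sys)

for _ in range(5):
    time.sleep(0.01)  # sleeping waits without using the CPU

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"CPU time:  {cpu_elapsed * 1000:.1f} ms")
print(f"Wall time: {wall_elapsed * 1000:.1f} ms")
```

A CPU-bound loop would show the opposite picture, with CPU time close to wall time.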

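%timeit and %%timeit use Python's timeit module: the statement is run in a loop many times (100,000 loops in the example below) and the best result over several repeats is reported.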

import numpy
%timeit numpy.random.normal(size=100)

100000 loops, best of 3: 5.5 µs per loop
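The same kind of measurement can be made directly with the timeit module. This is a sketch with fixed loop counts (the magic picks them automatically), and the timed statement is an arbitrary stand-in chosen to avoid the numpy dependency.

```python
import timeit

STATEMENT = "sum(range(100))"  # illustrative statement, not from the original

# Three repeats of 10,000 loops each; every entry is the total time of one repeat.
totals = timeit.repeat(STATEMENT, repeat=3, number=10_000)

best_per_loop = min(totals) / 10_000
print(f"10000 loops, best of 3: {best_per_loop * 1e6:.2f} µs per loop")
```

Taking the minimum rather than the mean reduces the influence of other processes that happened to run during a repeat.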

Line profiling %lprun

Reports the time spent on each line of a Python function.

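!pip install line_profiler
%load_ext line_profiler

# Sample input with duplicates (illustrative; the original does not define lst)
lst = ["ann", "bob", "ann", "cat", "bob"]

def remove_dups1(lst):
    # Keeps first occurrences; the list membership test makes this O(n^2)
    uniques = []
    for name in lst:
        if name not in uniques:
            uniques.append(name)
    return uniques

%lprun -f remove_dups1 remove_dups1(lst)

def remove_dups2(lst):
    # Set-based: faster, but does not preserve the original order
    return list(set(lst))

%lprun -f remove_dups2 remove_dups2(lst)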
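Before comparing profiles, it is worth checking that the two functions are actually interchangeable. The sketch below repeats both functions (with the set-based one written as set(lst)) and runs them on a small made-up list named names.

```python
def remove_dups1(lst):
    uniques = []
    for name in lst:
        if name not in uniques:
            uniques.append(name)
    return uniques

def remove_dups2(lst):
    return list(set(lst))

names = ["ann", "bob", "ann", "cat", "bob"]  # hypothetical sample data

ordered = remove_dups1(names)    # keeps first-seen order
unordered = remove_dups2(names)  # order is not guaranteed

print(ordered)
print(sorted(unordered) == sorted(ordered))
```

Both return the same unique items, but only the loop version keeps the original order, which matters if callers rely on it.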

Memory Profiling %mprun

Install the memory_profiler package, which provides %mprun

!pip install memory_profiler

%load_ext memory_profiler

Save the function/script as a file

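Using %%file my_file.py will save the contents of that Jupyter cell as a file. This step is needed because %mprun only works on functions defined in a file, not on functions defined in the notebook itself.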

%%file my_file.py
import pandas as pd

def calc_apply(df):
    column = df['COMB (mpg)']
    new_vals = column.apply(lambda x: x* 0.425)
    df['kml'] = new_vals
    return df

def calc_listcomp(df):
    column = df['COMB (mpg)']
    new_vals = [x*0.425 for x in column]
    df['kml'] = new_vals
    return df

def calc_direct(df):
    column = df['COMB (mpg)']
    new_vals = column*0.425
    df['kml'] = new_vals
    return df

def calc_numpy(df):
    column = df['COMB (mpg)'].values
    new_vals = column*0.425
    df['kml'] = pd.Series(new_vals)
    return df

Load the function/script

Next, load the memory profiler extension and import your functions from the file.

# General form:
# from my_file import func_name
# %mprun -f func_name func_name(params)
%load_ext memory_profiler

from my_file import (calc_apply, calc_listcomp,
                     calc_direct, calc_numpy)
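Where memory_profiler is not available, the standard library's tracemalloc gives a coarser, whole-call measurement. This is a different technique from %mprun (no line-by-line report, and it only sees allocations made through Python's allocator). The names peak_memory and scale_listcomp are hypothetical, and a plain list stands in for the DataFrame column.

```python
import tracemalloc

def peak_memory(func, *args, **kwargs):
    """Run func and return (result, peak bytes allocated during the call)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak

def scale_listcomp(values):
    # Stand-in for calc_listcomp, using a plain list instead of a DataFrame
    return [x * 0.425 for x in values]

result, peak = peak_memory(scale_listcomp, list(range(10_000)))
print(f"peak: {peak / 1024:.1f} KiB")
```

The input list is built before tracing starts, so the reported peak covers only what the function itself allocates.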

See the original post on lprun and mprun.

Advanced options ;-)

See