Data Python Skillset

  • Creating 1D and 2D arrays
  • Plotting 1D arrays (lines)
  • Plotting 2D arrays (colour maps)
  • Creating arrays
  • Creating arrays of random values
  • Creating patterned arrays
  • Loading tabular data from text files

Matplotlib CheatSheet

Matplotlib is used to plot graphs, so it is often used alongside NumPy.

Always use:

import matplotlib.pyplot as plt

import numpy as np  # assuming NumPy is always used with Matplotlib

  • Plotting a 1D array:

plt.plot(xarray,yarray)

plt.show()

  • Plotting a 2D array:

Colour images:

image = np.array([[0,1,2],[3,4,5]])

plt.imshow(image, cmap=plt.cm.jet)  # draws the array as an image using the jet colour map (cmap)

plt.colorbar()  # adds a colour bar showing which value each colour represents

plt.show()  #displays the plot
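
A minimal sketch of the same steps on a slightly larger array, to show how the array layout maps onto the picture. The array values and the name im are illustrative only.

```python
import matplotlib.pyplot as plt
import numpy as np

# Illustrative 2D array: 3 rows, 4 columns, values 0 to 11.
image = np.arange(12).reshape(3, 4)

plt.figure()
im = plt.imshow(image, cmap=plt.cm.jet)  # one coloured cell per array item
plt.colorbar(im)                         # scale linking colours to values
```

By default row 0 of the array is drawn at the top of the image. Call plt.show() afterwards to display it.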

  • Adding labels to the graphs:

plt.xlabel("Insert label for x axis here")

plt.ylabel("Insert label for y axis here")
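
Putting the plotting and labelling calls together might look like the sketch below. The helper name labelled_line_plot, the sine data and the label text are made up for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def labelled_line_plot(xarray, yarray, xlabel, ylabel):
    """Hypothetical helper: plot yarray against xarray and label both axes."""
    plt.plot(xarray, yarray)
    plt.xlabel(xlabel)
    plt.ylabel(ylabel)
    return plt.gca()  # the axes that were just drawn on


# Illustrative data: 50 evenly spaced x values and their sines.
xarray = np.linspace(0, 2 * np.pi, 50)
yarray = np.sin(xarray)

ax = labelled_line_plot(xarray, yarray, "x (radians)", "sin(x)")
```

Call plt.show() afterwards to display the figure.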

http://matplotlib.org/users/pyplot_tutorial.html – Pyplot tutorial

Numpy CheatSheet

Notes:

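Names and values used in the examples, such as dataset, are arbitrary. The shape of an array is a tuple giving its length along each dimension, e.g. (2, 3) for 2 rows and 3 columns.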

  • Importing module: import numpy as np

Basics

  • Creating a one-dimensional array: dataset = np.array([0,1,2,3])
  • Creating a two-dimensional array: dataset = np.array([[0,1,2],[3,4,5]])

Non-empty arrays

  • Creating an array holding the ordered sequence of 10 integers from 0 to 9: dataset = np.arange(10)
  • Creating an array of evenly spaced numbers over a given interval, both ends included: dataset = np.linspace(startnumber, endnumber, numberofpoints)
  • Creating an array of ones with a given shape (here 2 rows by 3 columns): dataset = np.ones((2,3))
  • Creating an array of zeros with a given shape (here 2 rows by 3 columns): dataset = np.zeros((2,3))
  • Creating an array with a user-given number of random values, drawn uniformly from 0 (inclusive) to 1 (exclusive): dataset = np.random.rand(4)
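
A quick sketch of what these calls return. The variable names and the linspace arguments are illustrative.

```python
import numpy as np

sequence = np.arange(10)        # integers 0, 1, ..., 9
spaced = np.linspace(0, 1, 5)   # 5 points from 0 to 1, both ends included
ones = np.ones((2, 3))          # 2 rows, 3 columns, all 1.0
zeros = np.zeros((2, 3))        # 2 rows, 3 columns, all 0.0
randoms = np.random.rand(4)     # 4 uniform random values in [0, 1)

print(sequence)                 # [0 1 2 3 4 5 6 7 8 9]
print(spaced)                   # [0.   0.25 0.5  0.75 1.  ]
print(ones.shape, zeros.shape)  # (2, 3) (2, 3)
```

The random values differ on every run, so they are not printed here.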

Metadata

  • Checking the number of dimensions of an array: dataset.ndim
  • Checking the length of an array along each dimension: dataset.shape
  • Checking the total number of items inside an array: dataset.size
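
A short sketch comparing these attributes on a 1D and a 2D array. The names one_d and two_d are illustrative, and size (the total item count) is included for contrast with shape.

```python
import numpy as np

one_d = np.array([0, 1, 2, 3])
two_d = np.array([[0, 1, 2], [3, 4, 5]])

print(one_d.ndim, one_d.shape, one_d.size)  # 1 (4,) 4
print(two_d.ndim, two_d.shape, two_d.size)  # 2 (2, 3) 6
```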

Statistics

Measures of central tendency and spread

Starting with the following array: dataset = np.array([0,1,2,3,4])

  • Mean: print(dataset.mean())
  • Median: print(np.median(dataset))
  • Standard deviation (SD): print(np.std(dataset))
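
A worked sketch for the example array. Note that np.std computes the population standard deviation by default (ddof=0); the sample version with ddof=1 is shown only for comparison and is an addition to these notes.

```python
import numpy as np

dataset = np.array([0, 1, 2, 3, 4])

mean = dataset.mean()                # 2.0
median = np.median(dataset)          # 2.0
pop_sd = np.std(dataset)             # sqrt(2), about 1.4142
sample_sd = np.std(dataset, ddof=1)  # sqrt(2.5), about 1.5811

print(mean, median, pop_sd, sample_sd)
```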

I/O

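  • Loading a text file: dataset = np.loadtxt('insert path here'), where the path is relative to the directory the program is run from. Note: if the first row is a header, make sure it starts with a hash (#), the default comment character, so that loadtxt skips it.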
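
A minimal sketch of loading tabular data, assuming whitespace-separated numeric columns and a header line that begins with #, which np.loadtxt treats as a comment by default. To keep the example self-contained, an in-memory io.StringIO object stands in for a file on disk; the column names and values are made up.

```python
import io

import numpy as np

# Stand-in for a text file: a commented header row, then rows of numbers.
text = io.StringIO(
    "# time value\n"
    "0.0 1.5\n"
    "1.0 2.5\n"
    "2.0 4.0\n"
)

dataset = np.loadtxt(text)  # header skipped because it starts with '#'
times = dataset[:, 0]       # first column
values = dataset[:, 1]      # second column

print(dataset.shape)        # (3, 2)
```

With a real file, pass its path as a string instead of the StringIO object.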