I’ve been exploring different mechanisms to post Python Jupyter notebook files on WordPress. Of course, I can use nbconvert
to convert my notebook files to other formats – including HTML – right from the command line. I can then post the resulting file as part of an embedded HTML block in a WordPress post. However, that seemed like an unnecessary step, since I also wanted the notebook to be available on GitHub. I did not want to deal with generating this HTML file AND managing a published notebook on GitHub as well. That smells a lot like duplicated effort and wasted time. Thanks to a great WordPress plugin from Andy Challis, called nbconvert, I was able to achieve exactly what I wanted! See his page at https://www.andrewchallis.co.uk/portfolio/php-nbconvert-a-wordpress-plugin-for-jupyter-notebooks/ for complete instructions.
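(For reference, the command-line conversion I mentioned above is a one-liner; the notebook filename here is just a placeholder:)
jupyter nbconvert --to html my_notebook.ipynb
In short, here are the steps I followed: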
- If you haven’t yet, you must install WP Pusher as a plugin on your WordPress site. (See this for more info.)
- Go to his web page for nbconvert and copy the custom CSS code displayed there.
- Go to your WordPress site and paste that custom CSS into Appearance -> Customize -> Additional CSS
- Go to https://github.com/ghandic/nbconvert and review the latest instructions. Then install the nbconvert shortcode plugin through WP Pusher and activate it.
- That’s it!
Follow the instructions there to include your own Jupyter notebook file hosted on GitHub.
Example
Here is an example. In a standalone text (or paragraph) block, I included the following shortcode:
[nbconvert url="https://github.com/bkingcs/python_snippets/blob/master/clustering/hierarchical.ipynb" /]
This generates the following:
Hierarchical Clustering
Example code for hierarchical clustering
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns; sns.set() # for plot styling
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage,fcluster,dendrogram, cophenet
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.cluster import adjusted_rand_score, \
    homogeneity_completeness_v_measure, contingency_matrix
from sklearn.datasets import make_blobs  # samples_generator was removed in newer scikit-learn
X, y_true = make_blobs(n_samples=60, centers=5,
                       cluster_std=(0.3,0.4,0.5,0.7,0.7),
                       center_box=(0, 8), random_state=1234)
y_true = pd.Categorical([["A","B","C","D","E"][x] for x in y_true])
df = pd.DataFrame(data={"x" : X[:,0],
                        "y" : X[:,1],
                        "target": y_true })
X = df.iloc[:,0:2]
y_true = df.iloc[:,2]
sns.scatterplot(x="x",y="y",hue="target",data=df )
plt.show()
Let's make our first hierarchical clustering. We'll do it piecewise, using some functions in the scipy.cluster.hierarchy package. We start by computing a distance matrix over all of our data:
d = pdist(X,metric="euclidean")
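(An aside that is not part of the rendered notebook: pdist returns a condensed, one-dimensional vector of pairwise distances rather than a full matrix. The squareform function imported above converts it to the familiar n x n form if you ever need it; a minimal sketch:)
# Expand the condensed distance vector into the full symmetric matrix
D = squareform(d)
print(D.shape)   # (60, 60) for the 60 samples generated above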
Now, let's perform the hierarchical clustering using single linkage:
lnk = linkage(d,method="single")
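(Another aside beyond what the rendered notebook shows: once the linkage matrix exists, fcluster (imported at the top) can cut the tree into flat cluster labels, which can then be compared against the known blob labels. A rough sketch, assuming we ask for five clusters to match the five blobs:)
# Cut the dendrogram into (at most) 5 flat clusters
labels = fcluster(lnk, t=5, criterion="maxclust")
# Compare the flat clustering against the known blob labels
print(adjusted_rand_score(y_true, labels))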
Finally, let's plot a basic dendrogram using the dendrogram function. Notice some of the options we'll use to get some more informative results:
# Plot the dendrogram, but label the leaves using the actual labels in the data
plt.figure(figsize=(11,6))
plt.title("Hierarchical Clustering: Single Linkage")
plt.xlabel("sample index")
plt.ylabel("distance")
dnd = dendrogram(lnk,labels=list(y_true),leaf_rotation=0,leaf_font_size=9,
                 color_threshold=2)
plt.show()
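(One last aside that goes beyond the rendered notebook: the cophenet function imported at the top gives the cophenetic correlation coefficient, a quick check of how faithfully the dendrogram preserves the original pairwise distances; a value near 1 is good. A minimal sketch:)
# Cophenetic correlation between the dendrogram and the original distances
c, coph_dists = cophenet(lnk, d)
print(c)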