Patch-clamp data analysis in Python: dataframes and statistics

On this blog, you can find tutorials for analyzing electrophysiological properties using Python and the software Clampfit (see full list here). These analyses usually end in a table of numerical results that you still need to explore and compare. The following Jupyter notebooks show examples of how to explore, plot, and calculate statistics from such datasets in Python.

If you do not know how to clone the repository, you can simply copy the code as usual. Feel free to adapt the scripts to your needs. Email me if something is unclear or the code does not work as expected.

Example data

For these Jupyter notebooks, I use the brain cell database from the Allen Institute for Brain Science as an example dataset. You can also download the table directly from the Allen Brain Atlas website by clicking “Download Cell Feature Data.” This dataset contains electrophysiological and morphological data obtained from patch-clamp recordings in both mouse and human brains. These features were used to classify neurons into distinct subtypes (Figure 1), which is one of the main challenges in neuroscience because of the brain’s high cellular diversity.

Defining a cell type is not trivial (Zheng, 2022), and morphoelectrical features can be combined with transcriptomic data for a more integrative classification (Gouwens et al., 2020; Scala et al., 2021). The goal is to identify cell types with similar attributes and functions to understand neural networks and behaviors. For example, although inhibitory interneurons (Tremblay et al., 2016) are fewer than excitatory neurons in the cortex (about 20% vs. 80%), there are at least four major classes of inhibitory neurons with distinct intrinsic and functional properties: Pvalb, Vip, Sst, and Lamp5 (Figure 1b).

Figure 1. Examples of excitatory (a) and inhibitory (b) neurons in the mouse visual cortex. Top panels: morphological reconstructions of dendrites. Bottom panels: electrophysiological responses from the same neurons to hyperpolarizing and depolarizing current injection. Source: Gouwens et al., 2019.

You can further explore the Allen Institute Cell Database using the interactive website or the Allen Software Development Kit (SDK). Through the Allen SDK, you can access all available electrophysiology measurements and morphological reconstructions.

Jupyter notebooks

Dataframes

This is a more general notebook that shows examples of how to explore tables with Pandas, plot data using Matplotlib and Seaborn, and do statistical tests using SciPy.

Honestly, I prefer plotting and doing statistics in GraphPad, since coding a similar plot in Python can be a painful experience. Unfortunately, GraphPad is very expensive. Fortunately, LLM applications can be of great help for fine-tuning plots in Python. R is also often considered more user-friendly for statistical analysis, and it may be worth trying both, with the help of LLMs, to find the right tool for your analysis. Nevertheless, statistical analysis of patch-clamp data does not generally require complex models, and Python libraries such as SciPy and statsmodels do the job.
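
As a rough idea of what this kind of workflow looks like, here is a minimal sketch that explores a small table with Pandas and compares two groups with SciPy. It is not taken from the notebook: the table is synthetic, and the column names (`cell_type`, `input_resistance_mohm`) and values are hypothetical stand-ins for a real cell-feature table.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for a cell-feature table.
# Column names and values are hypothetical, not from the Allen dataset.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cell_type": ["Pvalb"] * 30 + ["Sst"] * 30,
    "input_resistance_mohm": np.concatenate([
        rng.normal(120, 25, 30),
        rng.normal(220, 40, 30),
    ]),
})

# Explore: table size, missing values, and a per-group summary
print(df.shape)
print(df.isna().sum())
summary = (
    df.groupby("cell_type")["input_resistance_mohm"]
    .agg(["count", "mean", "std", "median"])
)
print(summary.round(1))

# Compare the two groups with a non-parametric test
pvalb = df.loc[df["cell_type"] == "Pvalb", "input_resistance_mohm"]
sst = df.loc[df["cell_type"] == "Sst", "input_resistance_mohm"]
result = stats.mannwhitneyu(pvalb, sst, alternative="two-sided")
print(f"U = {result.statistic:.1f}, p = {result.pvalue:.3g}")
```

With a real table, you would replace the synthetic dataframe with something like `pd.read_csv(...)` and pick the test that matches your design (paired or unpaired, two or more groups).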

Estimation statistics

This notebook adapts code from the paper “Moving beyond P values: data analysis with estimation graphics” by Ho et al., 2019 (PDF). Ho et al. developed DABEST (‘data analysis with bootstrap-coupled estimation’), a tool for calculating and plotting estimation statistics. DABEST is an open-source library for MATLAB, Python, and R, and is also available as a web application called estimationstats.

Ho and colleagues argue that estimation methods and plots can be a better (complementary, I would say) alternative to null hypothesis significance testing. In the case of plots, the figure below shows how to improve two-group data graphics from a bar plot (Figure 2a) to an estimation plot (Figure 2e).

Figure 2. Examples of how to plot two-group data. Source: Ho et al., 2019.
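
To make the idea of an effect size with a bootstrap confidence interval more concrete, here is a minimal sketch that computes a mean difference between two groups and a percentile bootstrap interval using only NumPy. This is not the DABEST API and not necessarily the exact bootstrap method the library uses. The function name `bootstrap_mean_diff` and the example values are hypothetical.

```python
import numpy as np


def bootstrap_mean_diff(control, test, n_boot=5000, ci=95, seed=0):
    """Return the mean difference (test - control) and a percentile bootstrap CI."""
    rng = np.random.default_rng(seed)
    control = np.asarray(control, dtype=float)
    test = np.asarray(test, dtype=float)
    observed = test.mean() - control.mean()

    # Resample each group independently, with replacement
    boot_control = rng.choice(control, size=(n_boot, control.size), replace=True)
    boot_test = rng.choice(test, size=(n_boot, test.size), replace=True)
    diffs = boot_test.mean(axis=1) - boot_control.mean(axis=1)

    # Percentile interval from the bootstrap distribution of differences
    alpha = (100 - ci) / 2
    low, high = np.percentile(diffs, [alpha, 100 - alpha])
    return observed, (low, high)


# Hypothetical two-group data in arbitrary units
rng = np.random.default_rng(1)
control = rng.normal(10.0, 2.0, 20)
treated = rng.normal(12.0, 2.0, 20)

diff, (low, high) = bootstrap_mean_diff(control, treated)
print(f"mean difference = {diff:.2f}, 95% CI [{low:.2f}, {high:.2f}]")
```

An estimation plot shows this same information graphically: the raw data for each group next to the effect size and its interval.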

Superplots and nested analysis

This Jupyter notebook shows how to make SuperPlots in Python using Matplotlib and Seaborn, adapting the tutorial by Lord et al., 2020.

In neuroscience, we have to consider both sample-to-sample (or individual-to-individual) differences and cell-to-cell variability in our experiments. What is n? This depends on the population you want to compare. In general, it is the number of independent experiments, unless you are specifically interested in cell-to-cell variability, as in patch-clamp experiments. One approach to communicate variability is to use “superplots” to show both summary statistics and the individual experiments within the dataset (Figure 3).

Figure 3. A. Plots with multiple observations and small error bars. B. Better plots to show the reproducibility of each experiment. Source: Lord et al., 2020.
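
The structure of a superplot can be sketched in a few lines: small points for individual cells, colored by replicate, with larger markers for each replicate mean on top. The example below is only an illustration with synthetic data and uses Matplotlib alone (the notebook also uses Seaborn). The column names `condition`, `replicate`, and `value` are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data: 2 conditions x 3 replicates x 15 cells (hypothetical values)
rng = np.random.default_rng(2)
rows = []
for condition, base in [("control", 10.0), ("treated", 12.0)]:
    for replicate in range(3):
        replicate_mean = base + rng.normal(0, 1.0)
        for value in rng.normal(replicate_mean, 2.0, 15):
            rows.append({"condition": condition, "replicate": replicate, "value": value})
df = pd.DataFrame(rows)

# One mean per replicate: these are the large markers of the superplot
replicate_means = df.groupby(["condition", "replicate"], as_index=False)["value"].mean()

fig, ax = plt.subplots(figsize=(4, 4))
positions = {"control": 0, "treated": 1}
colors = ["tab:blue", "tab:orange", "tab:green"]

# Individual cells: small, semi-transparent, jittered, colored by replicate
for (condition, replicate), cells in df.groupby(["condition", "replicate"]):
    x = positions[condition] + rng.uniform(-0.15, 0.15, len(cells))
    ax.scatter(x, cells["value"], s=10, alpha=0.4, color=colors[replicate])

# Replicate means: large markers drawn on top of the cells
for _, row in replicate_means.iterrows():
    ax.scatter(positions[row["condition"]], row["value"], s=90,
               color=colors[int(row["replicate"])], edgecolors="black", zorder=3)

# Condition mean, computed from the replicate means rather than from all cells
for condition, x in positions.items():
    mask = replicate_means["condition"] == condition
    ax.hlines(replicate_means.loc[mask, "value"].mean(), x - 0.25, x + 0.25, color="black")

ax.set_xticks(list(positions.values()))
ax.set_xticklabels(list(positions.keys()))
ax.set_ylabel("value (a.u.)")
fig.tight_layout()
# To view or export the figure, call plt.show() or fig.savefig("superplot.png")
```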


Clustered data arise when the observations in a study can be grouped into several units or clusters, such as cells nested within animals (Figure 4b). To analyze clustered or nested data, one conservative approach is to use only the mean of each replicate. A nested (or multilevel) analysis instead accounts for the variability both between and within clusters. The notebook also shows how to use statsmodels for a nested ANOVA.

Figure 4. Examples of crossed and nested experiments. Source: Krzywinski et al., 2015.
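
The conservative approach of using one mean per replicate can be sketched as follows. This is only an illustration with synthetic data, not the nested ANOVA from the notebook. The column names `group`, `animal`, and `value` are hypothetical, and the pooled test is included only to show what ignoring the nesting looks like.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic nested data: 2 groups x 5 animals x 8 cells (hypothetical values)
rng = np.random.default_rng(3)
rows = []
for group, base in [("control", 10.0), ("treated", 13.0)]:
    for i in range(5):
        animal = f"{group}_{i}"
        animal_mean = base + rng.normal(0, 1.0)  # animal-to-animal variability
        for value in rng.normal(animal_mean, 2.0, 8):  # cell-to-cell variability
            rows.append({"group": group, "animal": animal, "value": value})
df = pd.DataFrame(rows)

# Conservative approach: collapse cells to one mean per animal, so n = animals
animal_means = df.groupby(["group", "animal"], as_index=False)["value"].mean()
ctrl = animal_means.loc[animal_means["group"] == "control", "value"]
trt = animal_means.loc[animal_means["group"] == "treated", "value"]
per_animal = stats.ttest_ind(ctrl, trt, equal_var=False)

# For contrast: pooling all cells treats every cell as independent (n = cells)
pooled = stats.ttest_ind(
    df.loc[df["group"] == "control", "value"],
    df.loc[df["group"] == "treated", "value"],
    equal_var=False,
)

print(f"per-animal means: n = {len(ctrl)} vs {len(trt)}, p = {per_animal.pvalue:.3g}")
print(f"pooled cells:     n = 40 vs 40, p = {pooled.pvalue:.3g}")
```

Collapsing to animal means discards the within-animal information. A multilevel model, for example a mixed-effects model with a random intercept per animal (statsmodels provides `mixedlm` for this), is one way to keep both levels in the analysis.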

Resources

There are now many guidelines and tutorials focused on biology and neuroscience that help us to learn and improve our analyses. It is an ongoing process because journals have only started to standardize practices for reporting statistical information in the last few years (e.g., Nature journals).
