Jekyll2023-04-12T07:33:51-07:00https://bendichter.com/feed.xmlBen DichterNeuro-data scientist passionate about open dataBen Dichterben.dichter@catalystneuro.comGit Timesheet2022-07-18T00:00:00-07:002022-07-18T00:00:00-07:00https://bendichter.com/posts/git-timesheet<p>I recently faced a situation where I needed to assess the amount of work
done by each member of a team on a project that has spanned over a year.
That project has a git repo, and I could see when each person made a commit.
I decided to break it down by weeks. Whenever a person submitted any commit
to the repo on any branch, I counted them as working on the project for that week.
Of course this is imperfect: someone could work a lot and make no commits
that week, and someone else could submit a commit after working very
little. Still, this seems like the fairest way to assess work that I could think of.</p>
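<p>As a rough illustration of the weekly bucketing, here is a minimal sketch on a made-up commit log. The authors and dates are invented for the example, and <code>pd.crosstab</code> is just one convenient way to build an author-by-week table; it is not necessarily how the function in this post does it.</p>

```python
import pandas as pd

# Toy commit log; the authors and dates are made up for illustration.
commits = pd.DataFrame(
    {
        "author": ["alice", "alice", "bob", "alice", "bob"],
        "date": pd.to_datetime(
            ["2022-01-03", "2022-01-05", "2022-01-12", "2022-01-18", "2022-01-19"]
        ),
    }
)

# Number of whole weeks elapsed between the first commit and each commit.
week = (commits["date"] - commits["date"].min()).dt.days // 7

# Rows are authors, columns are week numbers; 1 means at least one commit that week.
presence = (pd.crosstab(commits["author"], week) > 0).astype(int)
print(presence)
```

<p>Note that a week in which nobody committed does not get a column in this sketch, so the table would need to be reindexed before plotting it on an even time axis.</p>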
<p>The code will work on any locally cloned git repo. <code class="language-plaintext highlighter-rouge">skip</code> allows you to remove
contributors, and is ideal for handling bots. <code class="language-plaintext highlighter-rouge">author_map</code> allows you to transform
handles. This is ideal if members of your team make some contributions from a
local clone of the repo and others through GitHub directly, or if
they have multiple usernames.</p>
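<p>To make the order of operations concrete, here is a small sketch of how the two options could combine: names in <code>skip</code> are dropped first, and the remaining handles are then mapped to one canonical name each. The helper name <code>normalize_authors</code> is hypothetical and only serves the example.</p>

```python
def normalize_authors(raw_authors, skip=(), author_map=None):
    """Drop skipped names, then map each remaining handle to a canonical name.

    Hypothetical helper for illustration only.
    """
    author_map = author_map or {}
    # A set removes duplicates that appear once several handles map to one person.
    return sorted({author_map.get(a, a) for a in raw_authors if a not in skip})


authors = normalize_authors(
    ["bendichter", "Ben Dichter", "dependabot[bot]", "luiz"],
    skip=["dependabot[bot]"],
    author_map={"bendichter": "Ben Dichter", "luiz": "Luiz Tauffer"},
)
print(authors)
```

<p>Because skipping happens before mapping in this sketch, <code>skip</code> has to list the raw names as they appear in the git log, not the mapped ones.</p>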
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">tqdm</span>
<span class="kn">import</span> <span class="nn">datetime</span>
<span class="kn">import</span> <span class="nn">matplotlib</span>
<span class="k">def</span> <span class="nf">git_timesheet</span><span class="p">(</span><span class="n">git_dir</span><span class="p">,</span> <span class="n">skip</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">author_map</span><span class="o">=</span><span class="bp">None</span><span class="p">):</span>
    <span class="k">if</span> <span class="n">skip</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">skip</span> <span class="o">=</span> <span class="p">[</span>
            <span class="s">"dependabot[bot]"</span><span class="p">,</span>
            <span class="s">"!git for-each-ref --format='%(refname:short)' `git symbolic-ref HEAD`"</span><span class="p">,</span>
        <span class="p">]</span>
    <span class="k">if</span> <span class="n">author_map</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">author_map</span> <span class="o">=</span> <span class="nb">dict</span><span class="p">()</span>
    <span class="c1"># dump every commit on every branch; each commit starts with a "--sha--date--author" line
</span>    <span class="n">os</span><span class="p">.</span><span class="n">system</span><span class="p">(</span><span class="sa">f</span><span class="s">"git --git-dir </span><span class="si">{</span><span class="n">git_dir</span><span class="si">}</span><span class="s">/.git log --all --numstat --pretty=format:'--%h--%ad--%aN' --no-renames > git.log"</span><span class="p">)</span>
    <span class="c1"># read one log line per row; the control-character separator is not expected to occur in the log
</span>    <span class="n">commits</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="s">"git.log"</span><span class="p">,</span> <span class="n">sep</span><span class="o">=</span><span class="s">"</span><span class="se">\u0012</span><span class="s">"</span><span class="p">,</span> <span class="n">header</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">names</span><span class="o">=</span><span class="p">[</span><span class="s">'raw'</span><span class="p">])</span>
    <span class="n">commit_marker</span> <span class="o">=</span> <span class="n">commits</span><span class="p">[</span><span class="n">commits</span><span class="p">[</span><span class="s">'raw'</span><span class="p">].</span><span class="nb">str</span><span class="p">.</span><span class="n">startswith</span><span class="p">(</span><span class="s">"--"</span><span class="p">,</span><span class="n">na</span><span class="o">=</span><span class="bp">False</span><span class="p">)]</span>
    <span class="n">commit_info</span> <span class="o">=</span> <span class="n">commit_marker</span><span class="p">[</span><span class="s">'raw'</span><span class="p">].</span><span class="nb">str</span><span class="p">.</span><span class="n">extract</span><span class="p">(</span><span class="sa">r</span><span class="s">"^--(?P<sha>.*?)--(?P<date>.*?)--(?P<author>.*?)$"</span><span class="p">,</span> <span class="n">expand</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">commit_info</span><span class="p">[</span><span class="s">'date'</span><span class="p">]</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">to_datetime</span><span class="p">(</span><span class="n">commit_info</span><span class="p">[</span><span class="s">'date'</span><span class="p">])</span>
    <span class="c1"># the remaining lines are tab-separated numstat rows: insertions, deletions, filename
</span>    <span class="n">file_stats_marker</span> <span class="o">=</span> <span class="n">commits</span><span class="p">[</span><span class="o">~</span><span class="n">commits</span><span class="p">.</span><span class="n">index</span><span class="p">.</span><span class="n">isin</span><span class="p">(</span><span class="n">commit_info</span><span class="p">.</span><span class="n">index</span><span class="p">)]</span>
    <span class="n">file_stats</span> <span class="o">=</span> <span class="n">file_stats_marker</span><span class="p">[</span><span class="s">'raw'</span><span class="p">].</span><span class="nb">str</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="s">"</span><span class="se">\t</span><span class="s">"</span><span class="p">,</span> <span class="n">expand</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">file_stats</span> <span class="o">=</span> <span class="n">file_stats</span><span class="p">.</span><span class="n">rename</span><span class="p">(</span><span class="n">columns</span><span class="o">=</span><span class="p">{</span><span class="mi">0</span><span class="p">:</span> <span class="s">"insertions"</span><span class="p">,</span> <span class="mi">1</span><span class="p">:</span> <span class="s">"deletions"</span><span class="p">,</span> <span class="mi">2</span><span class="p">:</span> <span class="s">"filename"</span><span class="p">})</span>
    <span class="n">file_stats</span><span class="p">[</span><span class="s">'insertions'</span><span class="p">]</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">to_numeric</span><span class="p">(</span><span class="n">file_stats</span><span class="p">[</span><span class="s">'insertions'</span><span class="p">],</span> <span class="n">errors</span><span class="o">=</span><span class="s">'coerce'</span><span class="p">)</span>
    <span class="n">file_stats</span><span class="p">[</span><span class="s">'deletions'</span><span class="p">]</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">to_numeric</span><span class="p">(</span><span class="n">file_stats</span><span class="p">[</span><span class="s">'deletions'</span><span class="p">],</span> <span class="n">errors</span><span class="o">=</span><span class="s">'coerce'</span><span class="p">)</span>
    <span class="c1"># forward-fill the commit info onto the file-stat rows that follow each commit line
</span>    <span class="n">commit_data</span> <span class="o">=</span> <span class="n">commit_info</span><span class="p">.</span><span class="n">reindex</span><span class="p">(</span><span class="n">commits</span><span class="p">.</span><span class="n">index</span><span class="p">).</span><span class="n">ffill</span><span class="p">()</span>
    <span class="n">commit_data</span> <span class="o">=</span> <span class="n">commit_data</span><span class="p">[</span><span class="o">~</span><span class="n">commit_data</span><span class="p">.</span><span class="n">index</span><span class="p">.</span><span class="n">isin</span><span class="p">(</span><span class="n">commit_info</span><span class="p">.</span><span class="n">index</span><span class="p">)]</span>
    <span class="n">commit_data</span> <span class="o">=</span> <span class="n">commit_data</span><span class="p">.</span><span class="n">join</span><span class="p">(</span><span class="n">file_stats</span><span class="p">)</span>
    <span class="c1"># get total authors and weeks
</span>    <span class="n">all_authors</span> <span class="o">=</span> <span class="n">commit_data</span><span class="p">[</span><span class="s">"author"</span><span class="p">].</span><span class="n">unique</span><span class="p">()</span>
    <span class="n">all_authors</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">unique</span><span class="p">([</span><span class="n">author_map</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">all_authors</span> <span class="k">if</span> <span class="n">x</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">skip</span><span class="p">]))</span>
    <span class="n">dates</span> <span class="o">=</span> <span class="n">commit_data</span><span class="p">[</span><span class="s">"date"</span><span class="p">]</span>
    <span class="n">start</span> <span class="o">=</span> <span class="n">dates</span><span class="p">.</span><span class="nb">min</span><span class="p">()</span>
    <span class="n">stop</span> <span class="o">=</span> <span class="n">dates</span><span class="p">.</span><span class="nb">max</span><span class="p">()</span>
    <span class="c1"># + 1 so that the final, possibly partial, week is included
</span>    <span class="n">n_weeks</span> <span class="o">=</span> <span class="p">(</span><span class="n">stop</span> <span class="o">-</span> <span class="n">start</span><span class="p">).</span><span class="n">days</span> <span class="o">//</span> <span class="mi">7</span> <span class="o">+</span> <span class="mi">1</span>
    <span class="n">timesheet</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">((</span><span class="nb">len</span><span class="p">(</span><span class="n">all_authors</span><span class="p">),</span> <span class="n">n_weeks</span><span class="p">))</span>
    <span class="c1"># iterate over commits and timesheet per week
</span>    <span class="k">for</span> <span class="n">week_n</span> <span class="ow">in</span> <span class="n">tqdm</span><span class="p">.</span><span class="n">trange</span><span class="p">(</span><span class="n">n_weeks</span><span class="p">):</span>
        <span class="c1"># week 0 starts at the first commit; each window is [week_start, week_stop)
</span>        <span class="n">week_start</span> <span class="o">=</span> <span class="n">start</span> <span class="o">+</span> <span class="n">datetime</span><span class="p">.</span><span class="n">timedelta</span><span class="p">(</span><span class="mi">7</span> <span class="o">*</span> <span class="n">week_n</span><span class="p">)</span>
        <span class="n">week_stop</span> <span class="o">=</span> <span class="n">start</span> <span class="o">+</span> <span class="n">datetime</span><span class="p">.</span><span class="n">timedelta</span><span class="p">(</span><span class="mi">7</span> <span class="o">*</span> <span class="p">(</span><span class="n">week_n</span> <span class="o">+</span> <span class="mi">1</span><span class="p">))</span>
        <span class="n">commit_data_for_week</span> <span class="o">=</span> <span class="n">commit_data</span><span class="p">[(</span><span class="n">week_start</span> <span class="o"><=</span> <span class="n">commit_data</span><span class="p">[</span><span class="s">"date"</span><span class="p">])</span> <span class="o">&</span> <span class="p">(</span><span class="n">commit_data</span><span class="p">[</span><span class="s">"date"</span><span class="p">]</span> <span class="o"><</span> <span class="n">week_stop</span><span class="p">)]</span>
        <span class="n">authors_for_week</span> <span class="o">=</span> <span class="n">commit_data_for_week</span><span class="p">[</span><span class="s">"author"</span><span class="p">].</span><span class="n">unique</span><span class="p">()</span>
        <span class="c1"># handle different usernames
</span>        <span class="n">authors_for_week</span> <span class="o">=</span> <span class="nb">list</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">unique</span><span class="p">([</span><span class="n">author_map</span><span class="p">.</span><span class="n">get</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">authors_for_week</span><span class="p">]))</span>
        <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="n">author</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">all_authors</span><span class="p">):</span>
            <span class="k">if</span> <span class="n">author</span> <span class="ow">in</span> <span class="n">authors_for_week</span><span class="p">:</span>
                <span class="n">timesheet</span><span class="p">[</span><span class="n">i</span><span class="p">,</span> <span class="n">week_n</span><span class="p">]</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">30</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">imshow</span><span class="p">(</span><span class="n">timesheet</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="s">"Greys"</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">set_yticks</span><span class="p">(</span><span class="nb">range</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">all_authors</span><span class="p">)))</span>
    <span class="n">_</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">set_yticklabels</span><span class="p">(</span><span class="n">all_authors</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s">"weeks"</span><span class="p">)</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">minorticks_on</span><span class="p">()</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">gca</span><span class="p">().</span><span class="n">xaxis</span><span class="p">.</span><span class="n">set_minor_locator</span><span class="p">(</span><span class="n">matplotlib</span><span class="p">.</span><span class="n">ticker</span><span class="p">.</span><span class="n">MultipleLocator</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">gca</span><span class="p">().</span><span class="n">yaxis</span><span class="p">.</span><span class="n">set_minor_locator</span><span class="p">(</span><span class="n">matplotlib</span><span class="p">.</span><span class="n">ticker</span><span class="p">.</span><span class="n">MultipleLocator</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span>
    <span class="n">plt</span><span class="p">.</span><span class="n">grid</span><span class="p">(</span><span class="n">which</span><span class="o">=</span><span class="s">"both"</span><span class="p">,</span> <span class="n">linewidth</span><span class="o">=</span><span class="mf">0.25</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">"k"</span><span class="p">)</span>
</code></pre></div></div>
<p>I developed a function for parsing the git log and creating a visualization
of weeks worked by each member. The repo I used this for is private, so I will
demonstrate it on a separate repo from CatalystNeuro that is public.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">git_timesheet</span><span class="p">(</span>
    <span class="s">"path/to/nwb-conversion-tools"</span><span class="p">,</span>
    <span class="n">author_map</span><span class="o">=</span><span class="p">{</span>
        <span class="s">"bendichter"</span><span class="p">:</span> <span class="s">"Ben Dichter"</span><span class="p">,</span>
        <span class="s">"luiz"</span><span class="p">:</span> <span class="s">"Luiz Tauffer"</span><span class="p">,</span>
        <span class="s">"luiztauffer"</span><span class="p">:</span> <span class="s">"Luiz Tauffer"</span><span class="p">,</span>
        <span class="s">"CodyCBakerPhD"</span><span class="p">:</span> <span class="s">"Cody Baker"</span><span class="p">,</span>
        <span class="s">"h-mayorquin"</span><span class="p">:</span> <span class="s">"Heberto Mayorquin"</span><span class="p">,</span>
        <span class="s">"sbuergers"</span><span class="p">:</span> <span class="s">"Steffen Bürgers"</span><span class="p">,</span>
        <span class="s">"weiglszonja"</span><span class="p">:</span> <span class="s">"Szonja Weigl"</span><span class="p">,</span>
    <span class="p">}</span>
<span class="p">)</span>
</code></pre></div></div>
<p><img src="../../images/git-timesheet.png" alt="git-timesheet" /></p>Grouped Bar Plot2022-07-18T00:00:00-07:002022-07-18T00:00:00-07:00https://bendichter.com/posts/grouped_bar<p>I often have a need to plot a grouped bar plot. Matplotlib provides
<a href="https://matplotlib.org/stable/gallery/lines_bars_and_markers/barchart.html">this example</a>,
which is helpful but not general enough for my needs, as it only
shows how to group two categories together. Here is a generalization of that
example that has been very helpful for me, and I hope it is helpful for others as well.</p>
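<p>The layout arithmetic for an arbitrary number of categories can be sketched on its own. This is only one way to do it, under the assumption that ticks sit at 0, 1, 2, ... and that each group should be centered on its tick; the helper name <code>bar_left_edges</code> is hypothetical and not part of Matplotlib.</p>

```python
import numpy as np


def bar_left_edges(n_groups, n_categories, gap=0.3):
    """Left edge of every bar, plus the common bar width.

    Hypothetical helper for illustration. Ticks are assumed to sit at
    0, 1, ..., n_groups - 1, and `gap` is the empty fraction of each unit
    of tick spacing.
    """
    width = (1 - gap) / n_categories
    x = np.arange(n_groups)
    # Row i holds the left edges of category i across all groups.
    edges = x[None, :] - 0.5 + gap / 2 + np.arange(n_categories)[:, None] * width
    return edges, width


edges, width = bar_left_edges(n_groups=4, n_categories=3)

# Midpoint between the left edge of the first bar and the right edge of the last bar.
group_centers = (edges[0] + edges[-1] + width) / 2
print(group_centers)
```

<p>These positions are left edges, so they would be passed to <code>ax.bar</code> with <code>align="edge"</code>; with the default center alignment each group would sit half a bar width to the left of its tick.</p>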
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">List</span><span class="p">,</span> <span class="n">Optional</span>
<span class="k">def</span> <span class="nf">grouped_barplot</span><span class="p">(</span>
    <span class="n">data</span><span class="p">,</span>
    <span class="n">clabels</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">],</span>
    <span class="n">xlabels</span><span class="p">:</span> <span class="n">List</span><span class="p">[</span><span class="nb">str</span><span class="p">],</span>
    <span class="n">gap</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.3</span><span class="p">,</span>
    <span class="n">show_legend</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="bp">True</span><span class="p">,</span>
    <span class="n">show_bar_labels</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="bp">True</span><span class="p">,</span>
    <span class="n">ax</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">plt</span><span class="p">.</span><span class="n">Axes</span><span class="p">]</span> <span class="o">=</span> <span class="bp">None</span><span class="p">,</span>
<span class="p">):</span>
    <span class="s">"""
    Parameters
    ----------
    data: array-like
        size=(len(clabels), len(xlabels))
    clabels: list(str)
        Category labels, one per row of data. Used as legend entries.
    xlabels: list(str)
        Group labels, one per column of data. Used as x-axis tick labels.
    gap: float
        Gap between adjacent groups of bars, as a fraction of the tick spacing
    show_legend: bool
        Show legend. Default = True
    show_bar_labels: bool
        Show data values above each bar. Default = True
    ax: plt.Axes
        If not provided, a new figure will be created.

    Returns
    -------
    ax, all_rects
    """</span>
    <span class="k">if</span> <span class="n">ax</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">_</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">()</span>
    <span class="n">x</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">arange</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">xlabels</span><span class="p">))</span> <span class="c1"># the label locations
</span>    <span class="n">width</span> <span class="o">=</span> <span class="p">(</span><span class="mi">1</span> <span class="o">-</span> <span class="n">gap</span><span class="p">)</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">clabels</span><span class="p">)</span> <span class="c1"># the width of the bars
</span>
    <span class="n">all_rects</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="p">(</span><span class="n">cdata</span><span class="p">,</span> <span class="n">clabel</span><span class="p">)</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">clabels</span><span class="p">)):</span>
        <span class="c1"># align="edge" treats each position as a left edge, so every group is centered on its tick
</span>        <span class="n">rects</span> <span class="o">=</span> <span class="n">ax</span><span class="p">.</span><span class="n">bar</span><span class="p">(</span><span class="n">x</span> <span class="o">-</span> <span class="mf">0.5</span> <span class="o">+</span> <span class="n">gap</span> <span class="o">/</span> <span class="mi">2</span> <span class="o">+</span> <span class="n">i</span> <span class="o">*</span> <span class="n">width</span><span class="p">,</span> <span class="n">cdata</span><span class="p">,</span> <span class="n">width</span><span class="p">,</span> <span class="n">align</span><span class="o">=</span><span class="s">"edge"</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="n">clabel</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">show_bar_labels</span><span class="p">:</span>
            <span class="n">ax</span><span class="p">.</span><span class="n">bar_label</span><span class="p">(</span><span class="n">rects</span><span class="p">,</span> <span class="n">padding</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
        <span class="n">all_rects</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">rects</span><span class="p">)</span>
    <span class="c1"># Add some text for labels, title and custom x-axis tick labels, etc.
</span>    <span class="n">ax</span><span class="p">.</span><span class="n">set_xticks</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">xlabels</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">show_legend</span><span class="p">:</span>
        <span class="n">ax</span><span class="p">.</span><span class="n">legend</span><span class="p">()</span>
    <span class="k">return</span> <span class="n">ax</span><span class="p">,</span> <span class="n">all_rects</span>
</code></pre></div></div>
<p>Example usage:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">grouped_barplot</span><span class="p">(</span>
<span class="n">data</span><span class="o">=</span><span class="p">[[</span><span class="mi">1</span><span class="p">,</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">],</span> <span class="p">[</span><span class="mi">2</span><span class="p">,</span><span class="mi">3</span><span class="p">,</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">],</span> <span class="p">[</span><span class="mi">4</span><span class="p">,</span><span class="mi">5</span><span class="p">,</span><span class="mi">6</span><span class="p">,</span><span class="mi">7</span><span class="p">]],</span>
<span class="n">clabels</span><span class="o">=</span><span class="p">[</span><span class="s">"there"</span><span class="p">,</span> <span class="s">"are"</span><span class="p">,</span> <span class="s">"categories"</span><span class="p">],</span>
<span class="n">xlabels</span><span class="o">=</span><span class="p">[</span><span class="s">"x"</span><span class="p">,</span> <span class="s">"labels"</span><span class="p">,</span> <span class="s">"go"</span><span class="p">,</span> <span class="s">"here"</span><span class="p">],</span>
<span class="p">)</span>
</code></pre></div></div>
<p><img src="../../images/grouped_barplot.png" alt="grouped_barplot" /></p>Stacked Step Plot2022-07-18T00:00:00-07:002022-07-18T00:00:00-07:00https://bendichter.com/posts/stacked-step-plot<p>Matplotlib provides a
<a href="https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.stackplot.html">stackplot</a>,
which stacks area, and a <a href="https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.step.html">step plot</a>,
which provides steps, but there is no stacked step plot. A stacked step plot is useful
when you have an accumulating resource made up of different types, each of which
begins and ends abruptly. I personally needed this when I wanted
to visualize my man-month commitment to my company’s funded projects over
the next 7 years.</p>
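<p>The core of a stacked step plot is to put every series on one shared x grid, carry each step value forward until its next breakpoint, and then take a cumulative sum across series. Here is a minimal sketch of that idea with made-up data. The helper name <code>step_values_on_grid</code> and the use of <code>np.searchsorted</code> are choices made for this example, and the convention assumed is that a series with one more x than y drops back to zero at its last x.</p>

```python
import numpy as np


def step_values_on_grid(xs, ys, grid):
    """Evaluate a step series on a shared grid (hypothetical helper).

    The series takes the value ys[i] from xs[i] until the next breakpoint.
    It is 0 before xs[0], and 0 from any breakpoint that has no matching y
    (for example a final x that marks the end of the series).
    """
    xs = np.asarray(xs)
    ys = np.asarray(ys, dtype=float)
    # Index of the most recent breakpoint at or before each grid point.
    idx = np.searchsorted(xs, grid, side="right") - 1
    inside = (idx >= 0) & (idx < len(ys))
    out = np.zeros(len(grid))
    out[inside] = ys[idx[inside]]
    return out


# Made-up example: series "A" runs from x=0 to x=5, series "B" from x=1 to x=4.
series = {"A": ([0, 2, 5], [1, 2]), "B": ([1, 4], [3])}
grid = np.array(sorted({x for xs, _ in series.values() for x in xs}))
stacked = np.cumsum(
    [step_values_on_grid(xs, ys, grid) for xs, ys in series.values()], axis=0
)
print(grid)
print(stacked)
```

<p>Each row of <code>stacked</code> is the top edge of one layer, so drawing the rows from last to first keeps the lower layers visible.</p>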
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">chain</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="k">def</span> <span class="nf">plot_stacked_step</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">names</span><span class="p">,</span> <span class="n">ax</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="n">sort</span><span class="o">=</span><span class="bp">True</span><span class="p">):</span>
    <span class="n">x_in</span> <span class="o">=</span> <span class="n">x</span>
    <span class="n">y_in</span> <span class="o">=</span> <span class="n">y</span>
    <span class="k">if</span> <span class="n">sort</span><span class="p">:</span>
        <span class="c1"># order the series by their first x value; plain lists also allow series of unequal length
</span>        <span class="n">idx</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">argsort</span><span class="p">([</span><span class="n">xs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">for</span> <span class="n">xs</span> <span class="ow">in</span> <span class="n">x_in</span><span class="p">])</span>
        <span class="n">x_in</span> <span class="o">=</span> <span class="p">[</span><span class="n">x_in</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">idx</span><span class="p">]</span>
        <span class="n">y_in</span> <span class="o">=</span> <span class="p">[</span><span class="n">y_in</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">idx</span><span class="p">]</span>
        <span class="c1"># reorder the names too, so each label stays with its series
</span>        <span class="n">names</span> <span class="o">=</span> <span class="p">[</span><span class="n">names</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">idx</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">ax</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">fig</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">7</span><span class="p">,</span> <span class="mi">3</span><span class="p">))</span>
    <span class="n">all_x</span> <span class="o">=</span> <span class="nb">sorted</span><span class="p">(</span><span class="nb">set</span><span class="p">(</span><span class="n">chain</span><span class="p">(</span><span class="o">*</span><span class="n">x_in</span><span class="p">)))</span>
    <span class="n">all_adj_y</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">xs</span><span class="p">,</span> <span class="n">ys</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">x_in</span><span class="p">,</span> <span class="n">y_in</span><span class="p">):</span>
        <span class="n">new_y</span> <span class="o">=</span> <span class="mi">0</span>
        <span class="n">iy</span> <span class="o">=</span> <span class="mi">0</span>
        <span class="n">adj_y</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">all_x</span><span class="p">:</span>
            <span class="k">if</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">xs</span><span class="p">:</span>
                <span class="n">new_y</span> <span class="o">=</span> <span class="n">ys</span><span class="p">[</span><span class="n">iy</span><span class="p">]</span> <span class="k">if</span> <span class="n">iy</span> <span class="o"><</span> <span class="nb">len</span><span class="p">(</span><span class="n">ys</span><span class="p">)</span> <span class="k">else</span> <span class="mi">0</span>
                <span class="n">adj_y</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">new_y</span><span class="p">)</span>
                <span class="n">iy</span> <span class="o">+=</span> <span class="mi">1</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">adj_y</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">new_y</span><span class="p">)</span>
        <span class="n">all_adj_y</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">adj_y</span><span class="p">)</span>
    <span class="n">stacked</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">cumsum</span><span class="p">(</span><span class="n">all_adj_y</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">name</span><span class="p">,</span> <span class="n">i_stacked</span> <span class="ow">in</span> <span class="nb">list</span><span class="p">(</span><span class="nb">zip</span><span class="p">(</span><span class="n">names</span><span class="p">,</span> <span class="n">stacked</span><span class="p">))[::</span><span class="o">-</span><span class="mi">1</span><span class="p">]:</span>
        <span class="n">ax</span><span class="p">.</span><span class="n">fill</span><span class="p">(</span>
            <span class="n">np</span><span class="p">.</span><span class="n">repeat</span><span class="p">(</span><span class="n">all_x</span><span class="p">,</span> <span class="mi">2</span><span class="p">),</span>
            <span class="n">np</span><span class="p">.</span><span class="n">hstack</span><span class="p">((</span><span class="mi">0</span><span class="p">,</span> <span class="n">np</span><span class="p">.</span><span class="n">repeat</span><span class="p">(</span><span class="n">i_stacked</span><span class="p">,</span> <span class="mi">2</span><span class="p">)))[:</span><span class="o">-</span><span class="mi">1</span><span class="p">],</span>
            <span class="n">label</span><span class="o">=</span><span class="n">name</span><span class="p">,</span>
        <span class="p">)</span>
    <span class="k">return</span> <span class="n">ax</span>
</code></pre></div></div>
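<p>The alignment loop in the function above can be tried out in isolation. The sketch below is only an illustration: <code class="language-plaintext highlighter-rouge">align_steps</code> is a hypothetical helper name that does not appear in the original code, and it uses plain integers in place of dates. It holds each series at its last level across a shared x grid and drops it to zero once the series runs out of values.</p>

```python
def align_steps(xs, ys, all_x):
    """Forward-fill one step series onto a shared, sorted x grid.

    Hypothetical helper for illustration; it mirrors the inner loop of
    the plotting function rather than replacing it.
    """
    level = 0      # current height of this series
    next_idx = 0   # index of the next value to read from ys
    out = []
    for x in all_x:
        if x in xs:
            # A breakpoint of this series: move to the next level,
            # or fall back to 0 once the values are used up.
            level = ys[next_idx] if next_idx < len(ys) else 0
            next_idx += 1
        out.append(level)
    return out


all_x = [1, 2, 3, 4, 5]
a = align_steps([1, 4], [4], all_x)    # active from x=1 until x=4
b = align_steps([2, 5], [0.5], all_x)  # active from x=2 until x=5
print(a)
print(b)
```

<p>Summing the aligned series cumulatively (as the function does with <code class="language-plaintext highlighter-rouge">np.cumsum</code>) then gives the top edge of each stacked layer.</p>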
<p>Example usage (fake data)</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
<span class="n">data</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"Project1"</span><span class="p">,</span>
        <span class="s">"start-stop"</span><span class="p">:</span> <span class="p">[</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2021</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">2026</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">28</span><span class="p">)],</span>
        <span class="s">"man-months"</span><span class="p">:</span> <span class="mi">4</span><span class="p">,</span>
    <span class="p">},</span>
    <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"Project2"</span><span class="p">,</span>
        <span class="s">"start-stop"</span><span class="p">:</span> <span class="p">[</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2022</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">2022</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">31</span><span class="p">)],</span>
        <span class="s">"man-months"</span><span class="p">:</span> <span class="mf">0.5</span><span class="p">,</span>
    <span class="p">},</span>
    <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"Project3"</span><span class="p">,</span>
        <span class="s">"start-stop"</span><span class="p">:</span> <span class="p">[</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2019</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">2024</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">30</span><span class="p">)],</span>
        <span class="s">"man-months"</span><span class="p">:</span> <span class="mf">0.5</span><span class="p">,</span>
    <span class="p">},</span>
    <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"Project4"</span><span class="p">,</span>
        <span class="s">"start-stop"</span><span class="p">:</span> <span class="p">[</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2022</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">15</span><span class="p">),</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">2023</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">14</span><span class="p">)],</span>
        <span class="s">"man-months"</span><span class="p">:</span> <span class="mf">2.</span><span class="p">,</span>
    <span class="p">},</span>
    <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"Project5"</span><span class="p">,</span>
        <span class="s">"start-stop"</span><span class="p">:</span> <span class="p">[</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2021</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">2024</span><span class="p">,</span> <span class="mi">12</span><span class="p">,</span> <span class="mi">31</span><span class="p">)],</span>
        <span class="s">"man-months"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
    <span class="p">},</span>
    <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"Project6"</span><span class="p">,</span>
        <span class="s">"start-stop"</span><span class="p">:</span> <span class="p">[</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2021</span><span class="p">,</span> <span class="mi">8</span><span class="p">,</span> <span class="mi">1</span><span class="p">),</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">2023</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">31</span><span class="p">)],</span>
        <span class="s">"man-months"</span><span class="p">:</span> <span class="mf">0.5</span>
    <span class="p">},</span>
    <span class="p">{</span>
        <span class="s">"name"</span><span class="p">:</span> <span class="s">"Project7"</span><span class="p">,</span>
        <span class="s">"start-stop"</span><span class="p">:</span> <span class="p">[</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2022</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">10</span><span class="p">),</span> <span class="n">datetime</span><span class="p">(</span><span class="mi">2023</span><span class="p">,</span> <span class="mi">9</span><span class="p">,</span> <span class="mi">30</span><span class="p">)],</span>
        <span class="s">"man-months"</span><span class="p">:</span> <span class="mf">0.58</span><span class="p">,</span>
    <span class="p">},</span>
<span class="p">]</span>
<span class="n">ax</span> <span class="o">=</span> <span class="n">plot_stacked_step</span><span class="p">(</span>
    <span class="p">[</span><span class="n">x</span><span class="p">[</span><span class="s">"start-stop"</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">data</span><span class="p">],</span>
    <span class="p">[[</span><span class="n">x</span><span class="p">[</span><span class="s">"man-months"</span><span class="p">]]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">data</span><span class="p">],</span>
    <span class="p">[</span><span class="n">x</span><span class="p">[</span><span class="s">"name"</span><span class="p">]</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">data</span><span class="p">],</span>
<span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xlabel</span><span class="p">(</span><span class="s">"time"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s">"man-months committed"</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">legend</span><span class="p">(</span><span class="n">bbox_to_anchor</span><span class="o">=</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">))</span>
<span class="n">ax</span><span class="p">.</span><span class="n">axhline</span><span class="p">(</span><span class="mi">12</span><span class="p">,</span> <span class="n">ls</span><span class="o">=</span><span class="s">'--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="s">'k'</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">axvline</span><span class="p">(</span><span class="n">datetime</span><span class="p">.</span><span class="n">now</span><span class="p">(),</span> <span class="n">ls</span><span class="o">=</span><span class="s">'--'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="p">[.</span><span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span><span class="p">,</span> <span class="p">.</span><span class="mi">5</span><span class="p">])</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_ylim</span><span class="p">((</span><span class="mi">0</span><span class="p">,</span> <span class="mf">12.5</span><span class="p">))</span>
</code></pre></div></div>
<p><img src="../../images/stacked_step_plot.png" alt="stacked step plot" /></p>Ben Dichterben.dichter@catalystneuro.comMatplotlib provides a stackplot, which stacks areas, and a step plot, which draws steps, but there is no stacked step plot. This is useful when you have an accumulating resource made up of different types, each of which begins and ends all at once. I personally had a need for this when I wanted to visualize my man-month commitment to my company’s funded projects over the next 7 years.Spiral Plot2022-06-17T00:00:00-07:002022-06-17T00:00:00-07:00https://bendichter.com/posts/spiral-plot<p>As an example, let’s use the Google Trends results for the search term “gifts.”
Google offers this plot:</p>
<p><img src="../../images/google_trends_gifts.png" alt="gifts-google-trends-plot" /></p>
<p>It should be no surprise that these results show a cyclical trend. It looks
like this might be an annual cycle with the max around
Christmas time. It can be hard to create visualizations that bring out
this cyclic pattern. Stacking years on top of each other will require
you to break the year at a certain point, breaking continuous data and
potentially creating the impression of two different spikes when there
is really just one.</p>
<p>I have created a way to plot cyclic data that I call a “spiral plot.”
The data starts at the center of a circle and proceeds outward in a spiral.
Each year forms one ring of the spiral, so that a given angle
of the circle shows data from the same time of year on every loop. Here
is the Google Trends result for “gifts” shown as a spiral plot:</p>
<p><img src="../../images/spiral_plot_no_donut.png" alt="gifts spiral plot with no donut" /></p>
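<p>The geometry behind this can be sketched in a few lines. This is a simplified illustration, not the plotting code from this post: <code class="language-plaintext highlighter-rouge">spiral_coords</code> and its parameters are hypothetical names, and it assumes evenly spaced samples with a whole number of samples per cycle. Each sample gets an angle from its position within the cycle and a radius from how many cycles have elapsed.</p>

```python
import numpy as np


def spiral_coords(n_samples, samples_per_cycle):
    """Map sample indices to polar coordinates on a spiral.

    Hypothetical helper for illustration only.
    """
    idx = np.arange(n_samples)
    # The angle depends only on the position within the cycle ...
    theta = 2 * np.pi * (idx % samples_per_cycle) / samples_per_cycle
    # ... while the radius grows by one unit per completed cycle.
    r = idx / samples_per_cycle
    return theta, r


# Three years of monthly data: samples 0, 12, and 24 fall in the same
# month, so they share an angle and sit one ring apart.
theta, r = spiral_coords(n_samples=36, samples_per_cycle=12)
print(theta[[0, 12, 24]], r[[0, 12, 24]])
```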
<p>This plot is more compact than the line version and may highlight some trends
more clearly. The drawback of this approach is that earlier years are smaller
than more recent years. You can make this less dramatic by giving the circle an
empty center (setting <code class="language-plaintext highlighter-rouge">origin=-2</code>).</p>
<p><img src="../../images/spiral_plot_w_donut.png" alt="gifts spiral plot" /></p>
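<p>The empty center comes from Matplotlib’s polar-axes radial origin. The snippet below is a minimal sketch on a bare polar axes; the limits of 0 to 6 are an assumption chosen to resemble five cycles, and the non-interactive <code class="language-plaintext highlighter-rouge">Agg</code> backend is selected only so that it runs without a display.</p>

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no window needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
ax.set_rlim(0, 6)    # radial data range (assumed: 5 cycles plus ring width)
ax.set_rorigin(-2)   # place the geometric center 2 units below r = 0

# With the origin at -2, the r = 0 circle is drawn at this fraction
# of the outer radius, which leaves a hole in the middle.
rmin, rmax, origin = ax.get_rmin(), ax.get_rmax(), ax.get_rorigin()
hole_fraction = (rmin - origin) / (rmax - origin)
print(hole_fraction)
```

<p>A more negative origin widens the hole and makes the inner and outer rings closer in size, at the cost of thinner rings overall.</p>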
<p>Code:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">typing</span> <span class="kn">import</span> <span class="n">Optional</span>
<span class="kn">import</span> <span class="nn">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">from</span> <span class="nn">matplotlib.collections</span> <span class="kn">import</span> <span class="n">PatchCollection</span>
<span class="kn">from</span> <span class="nn">matplotlib.patches</span> <span class="kn">import</span> <span class="n">Polygon</span>
<span class="k">def</span> <span class="nf">spiral_plot</span><span class="p">(</span>
    <span class="n">data</span><span class="p">,</span>
    <span class="n">num_cycles</span><span class="p">:</span> <span class="nb">int</span><span class="p">,</span>
    <span class="n">num_points_per_seg</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">100</span><span class="p">,</span>
    <span class="n">angle</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.</span><span class="p">,</span>
    <span class="n">origin</span><span class="p">:</span> <span class="nb">float</span> <span class="o">=</span> <span class="mf">0.</span><span class="p">,</span>
    <span class="n">cmap</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span>
    <span class="n">show_legend</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="bp">True</span><span class="p">,</span>
    <span class="n">ax</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">plt</span><span class="p">.</span><span class="n">Axes</span><span class="p">]</span> <span class="o">=</span> <span class="bp">None</span>
<span class="p">):</span>
    <span class="k">if</span> <span class="n">ax</span> <span class="ow">is</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">_</span><span class="p">,</span> <span class="n">ax</span> <span class="o">=</span> <span class="n">plt</span><span class="p">.</span><span class="n">subplots</span><span class="p">(</span><span class="n">subplot_kw</span><span class="o">=</span><span class="p">{</span><span class="s">'projection'</span><span class="p">:</span> <span class="s">'polar'</span><span class="p">})</span>
    <span class="n">n_segments</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
    <span class="n">num_points</span> <span class="o">=</span> <span class="n">num_points_per_seg</span> <span class="o">*</span> <span class="n">n_segments</span>
    <span class="n">inner_rs</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">num_cycles</span><span class="p">,</span> <span class="n">num_points</span><span class="p">)</span>
    <span class="n">outer_rs</span> <span class="o">=</span> <span class="n">inner_rs</span> <span class="o">+</span> <span class="mi">1</span>
    <span class="n">thetas</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">linspace</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">2</span><span class="o">*</span><span class="n">np</span><span class="p">.</span><span class="n">pi</span><span class="o">*</span><span class="n">num_cycles</span><span class="p">,</span> <span class="n">num_points</span><span class="p">)</span> <span class="o">+</span> <span class="n">angle</span>
    <span class="n">patches</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n_segments</span><span class="p">):</span>
        <span class="n">tt</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">hstack</span><span class="p">(</span>
            <span class="p">(</span>
                <span class="n">thetas</span><span class="p">[</span><span class="n">i</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">:(</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">],</span>
                <span class="n">thetas</span><span class="p">[</span><span class="n">i</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">:(</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">][::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
            <span class="p">)</span>
        <span class="p">)</span>
        <span class="n">rr</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">hstack</span><span class="p">(</span>
            <span class="p">(</span>
                <span class="n">inner_rs</span><span class="p">[</span><span class="n">i</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">:(</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">],</span>
                <span class="n">outer_rs</span><span class="p">[</span><span class="n">i</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">:(</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">*</span><span class="n">num_points_per_seg</span><span class="p">][::</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
            <span class="p">)</span>
        <span class="p">)</span>
        <span class="n">patch</span> <span class="o">=</span> <span class="n">Polygon</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="n">c_</span><span class="p">[</span><span class="n">tt</span><span class="p">,</span> <span class="n">rr</span><span class="p">])</span>
        <span class="n">patches</span><span class="p">.</span><span class="n">append</span><span class="p">(</span><span class="n">patch</span><span class="p">)</span>
    <span class="n">patches</span> <span class="o">=</span> <span class="n">PatchCollection</span><span class="p">(</span><span class="n">patches</span><span class="p">,</span> <span class="n">cmap</span><span class="o">=</span><span class="n">cmap</span><span class="p">)</span>
    <span class="n">patches</span><span class="p">.</span><span class="n">set_array</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">add_collection</span><span class="p">(</span><span class="n">patches</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">set_rlim</span><span class="p">((</span><span class="bp">None</span><span class="p">,</span> <span class="n">num_cycles</span><span class="o">+</span><span class="mi">1</span><span class="p">))</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">grid</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">set_rorigin</span><span class="p">(</span><span class="n">origin</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">show_legend</span><span class="p">:</span>
        <span class="n">ax</span><span class="p">.</span><span class="n">figure</span><span class="p">.</span><span class="n">colorbar</span><span class="p">(</span><span class="n">patches</span><span class="p">,</span> <span class="n">shrink</span><span class="o">=</span><span class="mf">0.6</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">.</span><span class="n">polar</span><span class="p">.</span><span class="n">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
    <span class="n">ax</span><span class="p">.</span><span class="n">spines</span><span class="p">.</span><span class="n">inner</span><span class="p">.</span><span class="n">set_visible</span><span class="p">(</span><span class="bp">False</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">ax</span><span class="p">,</span> <span class="n">patches</span>
<span class="c1"># Example usage
</span><span class="kn">import</span> <span class="nn">pandas</span> <span class="k">as</span> <span class="n">pd</span>
<span class="c1"># data from any google trend
</span><span class="n">fpath</span> <span class="o">=</span> <span class="s">"multiTimeline.csv"</span>
<span class="n">trend</span> <span class="o">=</span> <span class="s">"gifts"</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="n">read_csv</span><span class="p">(</span><span class="n">fpath</span><span class="p">,</span> <span class="n">header</span><span class="o">=</span><span class="mi">1</span><span class="p">)[</span><span class="sa">f</span><span class="s">"</span><span class="si">{</span><span class="n">trend</span><span class="si">}</span><span class="s">: (United States)"</span><span class="p">].</span><span class="n">values</span>
<span class="n">ax</span><span class="p">,</span> <span class="n">patches</span> <span class="o">=</span> <span class="n">spiral_plot</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">angle</span><span class="o">=</span><span class="mi">2</span><span class="o">*</span><span class="n">np</span><span class="p">.</span><span class="n">pi</span><span class="o">*</span><span class="mi">7</span><span class="o">/</span><span class="mi">12</span><span class="p">)</span>
<span class="c1"># make it prettier: fix the tick positions first, then label them
</span><span class="n">ax</span><span class="p">.</span><span class="n">set_xticks</span><span class="p">([</span><span class="mi">2</span><span class="o">*</span><span class="n">np</span><span class="p">.</span><span class="n">pi</span><span class="o">*</span><span class="n">i</span><span class="o">/</span><span class="mi">12</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">12</span><span class="p">)])</span>
<span class="n">ax</span><span class="p">.</span><span class="n">set_xticklabels</span><span class="p">([</span><span class="s">"Jan"</span><span class="p">,</span> <span class="s">"Feb"</span><span class="p">,</span> <span class="s">"Mar"</span><span class="p">,</span> <span class="s">"Apr"</span><span class="p">,</span> <span class="s">"May"</span><span class="p">,</span> <span class="s">"Jun"</span><span class="p">,</span> <span class="s">"Jul"</span><span class="p">,</span> <span class="s">"Aug"</span><span class="p">,</span> <span class="s">"Sep"</span><span class="p">,</span> <span class="s">"Oct"</span><span class="p">,</span> <span class="s">"Nov"</span><span class="p">,</span> <span class="s">"Dec"</span><span class="p">])</span>
<span class="n">ax</span><span class="p">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="s">'x'</span><span class="p">,</span> <span class="n">which</span><span class="o">=</span><span class="s">'major'</span><span class="p">,</span> <span class="n">pad</span><span class="o">=-</span><span class="mi">5</span><span class="p">)</span>
<span class="n">ax</span><span class="p">.</span><span class="n">tick_params</span><span class="p">(</span><span class="n">axis</span><span class="o">=</span><span class="s">'y'</span><span class="p">,</span> <span class="n">colors</span><span class="o">=</span><span class="s">'white'</span><span class="p">)</span>
</code></pre></div></div>Ben Dichterben.dichter@catalystneuro.combrokenaxes2020-07-12T00:00:00-07:002020-07-12T00:00:00-07:00https://bendichter.com/posts/brokenaxes<p><img width="200" src="https://raw.githubusercontent.com/bendichter/brokenaxes/master/broken_python_snake.png" title="broken python snake" alt="broken python snake" /></p>
<p>I created a Python package for creating broken axes plots like this one:</p>
<p><img width="400" src="https://raw.githubusercontent.com/bendichter/brokenaxes/master/example2.png" title="brokenaxes example" alt="brokenaxes example" /></p>
<p>You can create discontinuities along the x and/or y axis.
It is also compatible with a number of other useful features, like subplots and non-standard axes such as log and datetime.
Check out the documentation, with plenty of examples, on the <a href="https://github.com/bendichter/brokenaxes">GitHub repo</a>.</p>Ben Dichterben.dichter@catalystneuro.comOSX Jupyter Launcher2018-04-23T00:00:00-07:002018-04-23T00:00:00-07:00https://bendichter.com/posts/osx-jupyter-launcher<p>If you use Jupyter on a regular basis, the steps to launch a notebook are probably second nature, but if you take a step back, launching one involves a lot of prior knowledge. A few times I’ve tried to bring brand new, eager programmers into the glorious land of Python and Jupyter, but each time I found that the whole flow was really bogged down by this fairly technical preamble. I’ll give them an .ipynb file and then show them how to open it:</p>
<ol>
<li>Open Terminal (What’s Terminal? It looks scary.)</li>
<li>Use <code class="language-plaintext highlighter-rouge">cd</code> to navigate to where you want.</li>
<li>Now run this special command…
and <strong>finally</strong> you are in the user-friendly land of Jupyter.</li>
</ol>
<p>Now of course all of these skills are useful, and necessary eventually, but this really bogs down the first lesson in minutiae and inevitably leaves the student feeling a bit overwhelmed. There must be a better way! One solution is to set your student up with JupyterHub. They’ll just need to click a link and they’ll be up and running in no time! This is a great solution for a lot of cases, but it requires the instructor to set up a server and the student to have internet access, so it doesn’t fit all cases. “Why can’t I just double-click the notebook?” the student will ask (or be too embarrassed to ask). Well… um… why can’t you? Now you can. Here’s how.</p>
<p><a href="../../files/run_jupyter_notebook.zip">Download me!</a> Double-click to unpack it, then drag it to Applications or wherever you want to keep it.</p>
<p>Navigate to a notebook in Finder, right-click, and choose “Get Info”. Expand “Open with:” and choose “Other…” from the dropdown menu. Navigate to and select run_jupyter_notebook, then select “Change All…”</p>
<p><img width="200" src="../../images/run_jupyter_notebook.png" title="change jupyter notebook settings" alt="change notebook settings" /></p>
<p>Now you can double-click your notebooks to start them!</p>
<h2 id="caveats">Caveats</h2>
<ul>
<li>This only works on Macs right now (sorry Windows. Linux users, y’all chose this life.)</li>
<li>Every time you double-click, it opens a new Terminal window.</li>
<li>This won’t run a notebook in a virtual or conda environment.</li>
</ul>
<p>You can still open notebooks the normal way if you need to have more control over how the notebook is launched.</p>Ben Dichterben.dichter@catalystneuro.comtenseflow2018-03-29T00:00:00-07:002018-03-29T00:00:00-07:00https://bendichter.com/posts/tenseflow<p><img width="400" src="https://github.com/bendichter/tenseflow/blob/master/static/screenshot.png?raw=true" title="tenseflow app" alt="tenseflow app" /></p>
<p>I was frustrated while changing the tense of a document, and decided to go down the deep dark rabbit hole of creating an
automatic tense changer. The basic usage is:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="nn">tenseflow</span> <span class="kn">import</span> <span class="n">change_tense</span>
<span class="n">change_tense</span><span class="p">(</span><span class="s">'I will go to the store.'</span><span class="p">,</span> <span class="s">'past'</span><span class="p">)</span>
<span class="sa">u</span><span class="s">'I went to the store.'</span>
</code></pre></div></div>
<p>Little did I know, this is a really tough task, for a few reasons. For anyone who wants to venture down this path,
here are a few of the finer points you’ll need to deal with:</p>
<ol>
<li>Identifying verbs is harder than it looks. For instance, take the word “<u>vacuum</u>.” This word could be used as a noun
(“Please hand me the <u>vacuum</u>.”) or a verb (“Please <u>vacuum</u> the dining room.”). Vacuum is not a special word;
in fact, if you think about it, <strong>most</strong> verbs in the English language can be used as nouns and <strong>most</strong> nouns can be used as verbs.
If you blindly convert any word that could be a verb, you’ll get nonsense like “Please hand me the <u>vacuumed</u>.”
Therefore, in order to properly tense-alter a passage, you need to first parse the sentence to determine which words
are being used as verbs. You also need to parse their role in the sentence. For instance, infinitives do not change with
tense. (We don’t want, e.g., “You asked me to <u>vacuumed</u>.”)</li>
<li>Once you have identified which word you want to change, there are so many irregular verbs and special rules that you
really need an entire dictionary to do this properly.</li>
<li>There are more tenses in English than you might realize. Common wisdom is that we have 3: past, present, and future.
In fact, there are 12, and each of them has three modes: affirmative, negative, and interrogative.</li>
</ol>
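<p>The first two points are easy to reproduce with a toy example. The sketch below is not how tenseflow works: <code class="language-plaintext highlighter-rouge">PAST_FORMS</code> and <code class="language-plaintext highlighter-rouge">naive_past</code> are hypothetical names, and the three-word lookup table stands in for the full dictionary a real tool would need. It converts every word that could be a verb, ignoring its role in the sentence.</p>

```python
import re

# Toy lookup of past-tense forms (an assumption for illustration);
# "go" is irregular, so no suffix rule would produce "went".
PAST_FORMS = {"go": "went", "hand": "handed", "vacuum": "vacuumed"}


def naive_past(sentence):
    """Replace every word that *could* be a verb with its past form."""
    def swap(match):
        word = match.group(0)
        return PAST_FORMS.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", swap, sentence)


# "vacuum" is a noun here, but the naive converter changes it anyway.
result = naive_past("Please hand me the vacuum.")
print(result)
```

<p>Avoiding this requires knowing each word’s part of speech in context, which is why the sentence has to be parsed before anything is converted.</p>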
<p><img width="400" src="https://lessonsforenglish.com/wp-content/uploads/2019/12/12-Tenses-Formula-With-Examples.png" title="table of tenses" alt="table of tenses" /></p>
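<p>The count of 12 follows from the conventional classification of three times crossed with four aspects. The enumeration below is only an illustration of that arithmetic; the label strings are the usual grammar-book names, not identifiers from tenseflow.</p>

```python
from itertools import product

TIMES = ("past", "present", "future")
ASPECTS = ("simple", "continuous", "perfect", "perfect continuous")
MODES = ("affirmative", "negative", "interrogative")

# 3 times x 4 aspects = 12 tenses
tenses = [f"{time} {aspect}" for time, aspect in product(TIMES, ASPECTS)]

# Each tense in each of the 3 modes = 36 sentence forms to handle
forms = list(product(tenses, MODES))

print(len(tenses), len(forms))
```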
<ol start="4">
<li>There are all sorts of cases where you would want to have multiple tenses in the same sentence, and there isn’t really
a good way to infer this automatically.</li>
</ol>
<p>Despite these obstacles, I managed to make a tool that works… OK. It comes with a web-app.
Check it out on GitHub <a href="https://github.com/bendichter/tenseflow">here</a>.</p>Ben Dichterben.dichter@catalystneuro.com