<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://shangll.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://shangll.github.io/" rel="alternate" type="text/html" /><updated>2026-03-19T05:03:04+00:00</updated><id>https://shangll.github.io/feed.xml</id><title type="html">SHANG, LINLIN</title><subtitle>personal description</subtitle><author><name>Linlin Shang</name><email>shanglinlin@lnu.edu.cn</email></author><entry><title type="html">MiKTeX Basic Tutorial</title><link href="https://shangll.github.io/posts/2025/07/py-notes/" rel="alternate" type="text/html" title="MiKTeX Basic Tutorial" /><published>2025-07-28T00:00:00+00:00</published><updated>2025-07-28T00:00:00+00:00</updated><id>https://shangll.github.io/posts/2025/07/latex</id><content type="html" xml:base="https://shangll.github.io/posts/2025/07/py-notes/"><![CDATA[<p><a href="https://shangll.notion.site/Install-MiKTex-TeXnicCenter-for-the-beginner-c2198175c0344d0db7470b908387ea6b?source=copy_link">CH.1 - LaTeX Environment Setup and Usage Overview</a></p>

<p><a href="https://shangll.notion.site/454b03fa23d24743911d5b6883257af1?source=copy_link">CH.2 - Font and Font Size Settings</a></p>

<p><a href="https://shangll.notion.site/68598476b7864fe6ba1627c98c3cc9b5?source=copy_link">CH.3 - Document Structure</a></p>

<p><a href="https://shangll.notion.site/02dc95dfda1d42bcb2a93b9eacde60ac?source=copy_link">CH.4 - Handling Special Characters</a></p>

<p><a href="https://shangll.notion.site/445cc0c477b14cd4a04167adfeb4ac32?source=copy_link">CH.5 - Figures and Tables</a></p>

<p><a href="https://shangll.notion.site/aafb4180259f4c09a8fd6a06041b1112?source=copy_link">CH.6 - Mathematical Formulas</a></p>

<p><a href="https://shangll.notion.site/9adf21756e15482589694da59ba80bc2?source=copy_link">CH.7 - References</a></p>

<p><a href="https://shangll.notion.site/11833271c9314974a439825c42bfd33a?source=copy_link">CH.8 - Custom Commands and Environments</a></p>]]></content><author><name>Linlin Shang</name><email>shanglinlin@lnu.edu.cn</email></author><category term="LaTex" /><category term="MikTex" /><category term="TeXstudio" /><summary type="html"><![CDATA[CH.1 - LaTeX Environment Setup and Usage Overview]]></summary></entry><entry><title type="html">PyTorch Notes: DL (updated sporadically)</title><link href="https://shangll.github.io/posts/2024/08/pytorch-notes/" rel="alternate" type="text/html" title="PyTorch Notes: DL (updated sporadically)" /><published>2024-08-13T00:00:00+00:00</published><updated>2024-08-13T00:00:00+00:00</updated><id>https://shangll.github.io/posts/2024/08/pytorch-notes</id><content type="html" xml:base="https://shangll.github.io/posts/2024/08/pytorch-notes/"><![CDATA[<p><a href="https://shangll.notion.site/W1D1-Making-Tensors-fd6c7365b7154019826166e212351882">W1D1: Making Tensors</a></p>

<p><a href="https://shangll.notion.site/W1D2-Learning-Hyperparameters-2da1d27148f0447ba5ddec240fed6b74?pvs=4">W1D2.1: Linear Deep Learning</a></p>

<p><a href="https://shangll.notion.site/W1D2-2-Linear-Deep-Learning-1dc69408a53280bb833dd22652ed64ef?pvs=4">W1D2.2: Linear Deep Learning</a></p>]]></content><author><name>Linlin Shang</name><email>shanglinlin@lnu.edu.cn</email></author><category term="python" /><category term="pytorch" /><category term="DNN" /><category term="CNN" /><category term="ANN" /><summary type="html"><![CDATA[W1D1: Making Tensors]]></summary></entry><entry><title type="html">Decoding/MVPA - ML</title><link href="https://shangll.github.io/posts/2025/06/MVPA/" rel="alternate" type="text/html" title="Decoding/MVPA - ML" /><published>2022-10-05T00:00:00+00:00</published><updated>2022-10-05T00:00:00+00:00</updated><id>https://shangll.github.io/posts/2025/06/MVPA</id><content type="html" xml:base="https://shangll.github.io/posts/2025/06/MVPA/"><![CDATA[<blockquote>
  <ul>
    <li><em>Note: this post is based on Bae's 2018 J. Neurosci. paper, "Dissociable Decoding of Spatial Attention and Working Memory from EEG Oscillations and Sustained Potentials"; reading the original is strongly recommended.</em></li>
    <li><em>This walkthrough is written only for readers who understand EEG and fMRI experiments, have preprocessing experience, and can read code, but have no decoding experience; if you lack that background, reading on is not recommended. Please credit the source when reposting.</em></li>
  </ul>
</blockquote>

<p><strong><em>#There is an Easter egg at the end</em></strong></p>

<p><strong>What is decoding doing, and how is it done?</strong></p>

<p>When people view different categories of images, the brain responds with different activity patterns. If the brain encodes a stimulus, activity evoked by two stimulus classes should differ, and a good classifier can pull those two patterns apart (which shows that decoding is, at heart, about representations). For example, with 8 runs in one session, you can train on the first 7 runs and test on the last one. Decoding seems to have been adopted earlier in fMRI (I am not certain, but that is my impression), where it was called MVPA. The two are essentially the same thing: the "V" stands for voxel, and in EEG the feature dimension becomes the time course instead, so if EEG decoding were not called decoding it would have to be called MTPA. That is all there is to it.</p>

<p><strong><em>Interim Summary</em></strong>: the core idea of decoding is that I have a pile of data that is separable into two (or more) classes. I carve off a large portion, use a good classifier to learn the classification on it, and then test the trained classifier on the remaining data. It is that simple.</p>
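<p>That recipe in miniature, with synthetic two-class data (all names and numbers here are illustrative, not from Bae's dataset):</p>

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# 80 "trials" x 27 "channels": two classes with shifted means
X = np.vstack([rng.normal(0.0, 1.0, (40, 27)),
               rng.normal(1.0, 1.0, (40, 27))])
y = np.repeat([0, 1], 40)

# hold out a quarter of the trials (the "last run") for testing,
# train the classifier on the rest
idx = rng.permutation(len(y))
n_test = len(y) // 4
train, test = idx[n_test:], idx[:n_test]

clf = SVC(kernel='linear').fit(X[train], y[train])
acc = clf.score(X[test], y[test])
print(acc)
```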

<p><strong>The next question: how do you actually run a decoding analysis?</strong></p>

<p>First, load the data:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kn">import</span> <span class="nn">mne</span>
<span class="n">epochs</span> <span class="o">=</span> <span class="n">mne</span><span class="p">.</span><span class="n">read_epochs</span><span class="p">(</span><span class="s">'preprocessed_epochs_1.fif'</span><span class="p">)</span>
<span class="n">t_list</span> <span class="o">=</span> <span class="n">epochs</span><span class="p">.</span><span class="n">times</span>
<span class="n">labels</span> <span class="o">=</span> <span class="n">epochs</span><span class="p">.</span><span class="n">events</span><span class="p">[:,</span><span class="mi">2</span><span class="p">]</span>
<span class="n">subj_data</span> <span class="o">=</span> <span class="n">epochs</span><span class="p">.</span><span class="n">get_data</span><span class="p">()</span>
<span class="c1"># data structure: 640 trials * 27 channels * 750 times
</span></code></pre></div></div>

<p>In Bae's paper each trial has 750 time points, every participant completed 640 trials, and 27 channels are decoded. Decoding can be ERP-based or band-power-based; Bae did both, but I only demonstrate the ERP-based version, since the steps are identical. We start by decoding the already loaded subject 1. Here I insert one extra step, <strong><em>crop data</em></strong>, i.e., selecting a specific time window. It is not mandatory, but highly recommended, especially when your effect is only marginally significant. Why? Because decoding is done time point by time point. Our data are sampled at 250 Hz, so one time point corresponds to 0.004 s and 750 time points span 3 s. Bae's paper uses 1.5 s before and after onset. If we mostly care about the first second or so, we can keep the 0 to 1.5 s range, which effectively removes the baseline. Without cropping, all 3 s are fed into the decoder, which will certainly do worse than decoding the 0-1.5 s segment alone. That said, if you are far from significance this trick will not help. With data, <strong><em>what should be significant will be significant however you analyze it, and what is not will not be, no matter what you do</em></strong>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="c1"># crop data
</span><span class="n">t0</span><span class="p">,</span><span class="n">t1</span> <span class="o">=</span> <span class="mf">0.0</span><span class="p">,</span><span class="mf">1.5</span>
<span class="n">t0_indx</span><span class="p">,</span><span class="n">t1_indx</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">t_list</span><span class="o">&gt;=</span><span class="n">t0</span><span class="p">)[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">],</span><span class="n">np</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">t_list</span><span class="o">&lt;=</span><span class="n">t1</span><span class="p">)[</span><span class="mi">0</span><span class="p">][</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span>
<span class="c1"># apply the crop to the times vector and the data</span>
<span class="n">t_list</span> <span class="o">=</span> <span class="n">t_list</span><span class="p">[</span><span class="n">t0_indx</span><span class="p">:</span><span class="n">t1_indx</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span>
<span class="n">subj_data</span> <span class="o">=</span> <span class="n">subj_data</span><span class="p">[:,:,</span><span class="n">t0_indx</span><span class="p">:</span><span class="n">t1_indx</span><span class="o">+</span><span class="mi">1</span><span class="p">]</span>
</code></pre></div></div>
<p>That leaves 375 time points, and we decode on exactly these 375 points.</p>
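<p>The arithmetic behind those numbers, as a quick sanity check (the times vector here is synthetic, mimicking a 250 Hz epoch from -1.5 s onward):</p>

```python
import numpy as np

sfreq = 250.0                            # 1 sample = 0.004 s
t_list = np.arange(750) / sfreq - 1.5    # 750 points = 3 s of data
t0, t1 = 0.0, 1.5
t0_indx = np.where(t_list >= t0)[0][0]
t1_indx = np.where(t_list <= t1)[0][-1]
print(t1_indx - t0_indx + 1)             # 375
```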

<p>Next comes step one of Bae's decoding recipe: <strong><em>down-sampling</em></strong>. What is down-sampling? Reducing the number of time points: Bae averages every 5 time points. There is no fixed rule for the window length; average more points if you want a smoother curve. Some published analyses report that however you average, effects that should be significant stay significant, so the final result is unaffected. If you are still uneasy, you can average points 0-5, then 1-6, then 2-7, and so on, as a psychological comfort: your temporal resolution is not reduced, yet the result is smoothed.</p>

<p>Why resample at all? Honestly, skipping this step has no substantive effect on the results. What it does address is this: trials are aligned via markers, and with limited samples not every trial is aligned down to the single sample (1/250 Hz = 0.004 s), so averaging over a small time window (5 time points) washes out the residual misalignment. The visible effect is that down-sampling makes the curves look much smoother; it also saves computation, a genuine efficiency gain (unless you only do the sliding-window smoothing for reassurance, in which case compute time balloons). Without resampling, the plotted result is jagged and rough-looking; plot it yourself if you are curious. And once more: <strong><em>what should be significant will be significant however you analyze it, and what is not will not be, no matter what you do</em></strong>.</p>
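<p>The overlapping "0-5, 1-6, 2-7, ..." smoothing mentioned above is just a moving average along the time axis; a minimal sketch (the function name is mine, not from Bae):</p>

```python
import numpy as np

def smooth_t(data, win=5):
    # moving average over the last (time) axis; keeps
    # n_times - win + 1 points, so time resolution is preserved
    kernel = np.ones(win) / win
    return np.apply_along_axis(
        lambda x: np.convolve(x, kernel, mode='valid'), -1, data)

data = np.random.rand(640, 27, 375)
print(smooth_t(data).shape)  # (640, 27, 371)
```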

<p>I wrote this as a small helper function, and I suggest structuring your code the same way: it reads more clearly and is easy to reuse, which matters here because we are decoding subject by subject. The statistical test at the end has to be at the group level; you cannot run statistics within a single subject.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1"># down sample
</span>
<span class="n">t_space</span> <span class="o">=</span> <span class="mi">5</span>
<span class="k">def</span> <span class="nf">resamp_t</span><span class="p">(</span><span class="n">data</span><span class="p">):</span>
    <span class="c1"># one output point per non-overlapping window of t_space samples</span>
    <span class="n">t_down</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span><span class="o">//</span><span class="n">t_space</span>
    <span class="n">data_avg_t</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">([</span><span class="n">data</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span><span class="n">data</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span><span class="n">t_down</span><span class="p">])</span>
    <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">t_down</span><span class="p">):</span>
        <span class="n">data_avg_t</span><span class="p">[:,:,</span><span class="n">t</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">average</span><span class="p">(</span><span class="n">data</span><span class="p">[:,:,</span><span class="n">t</span><span class="o">*</span><span class="n">t_space</span><span class="p">:</span><span class="n">t</span><span class="o">*</span><span class="n">t_space</span><span class="o">+</span><span class="n">t_space</span><span class="p">],</span><span class="n">axis</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>

    <span class="k">return</span> <span class="n">data_avg_t</span>

<span class="n">subj_data_resamp</span> <span class="o">=</span> <span class="n">resamp_t</span><span class="p">(</span><span class="n">subj_data</span><span class="p">)</span>
</code></pre></div></div>

<p>Next we reduce noise. How? By <strong><em>averaging trials</em></strong>, separately within each condition. As discussed, decoding is classification: we must tell the classifier which data belong to class 1, which to class 2, and so on, so trials from different classes must never be averaged together. Taking Bae's data as an example, there are 16 orientations and 16 locations; our data are the 16-orientation set, so the structure is 16 orientations * 40 trials * 27 channels * 75 times. This step is what produces the labels. The labeling code sits together with the trial-averaging code, so keep reading.</p>

<p>Once trials are sorted by condition, they are averaged in groups of 13: the 40 trials (of which only 39 divide evenly) are split into 3 segments, each averaged internally. To make the data more stable and reliable, shuffle the trials randomly before averaging. The group size, like the down-sampling window, is up to you. Following Bae, you end up with a data structure of 16 orientations * 3 trials * 27 channels * 75 times.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1"># average trials
</span>
<span class="n">condN</span> <span class="o">=</span> <span class="mi">16</span>
<span class="n">trialN</span> <span class="o">=</span> <span class="mi">40</span>
<span class="n">trial_space</span> <span class="o">=</span> <span class="mi">13</span>
<span class="kn">import</span> <span class="nn">math</span>
<span class="n">trialN_final</span> <span class="o">=</span> <span class="n">math</span><span class="p">.</span><span class="n">floor</span><span class="p">(</span><span class="n">trialN</span><span class="o">/</span><span class="n">trial_space</span><span class="p">)</span>
<span class="n">chans</span> <span class="o">=</span> <span class="mi">27</span>
<span class="n">t_points</span> <span class="o">=</span> <span class="n">subj_data_resamp</span><span class="p">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="c1"># 75 down-sampled time points</span>

<span class="k">def</span> <span class="nf">dict_to_arr</span><span class="p">(</span><span class="n">d</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">([</span><span class="n">item</span> <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">d</span><span class="p">.</span><span class="n">values</span><span class="p">()])</span>

<span class="k">def</span> <span class="nf">avg_trials</span><span class="p">(</span><span class="n">old_data</span><span class="p">):</span>
    <span class="n">data_final</span><span class="p">,</span><span class="n">label_final</span><span class="p">,</span><span class="n">data</span><span class="p">,</span><span class="n">label</span> <span class="o">=</span> \
        <span class="nb">dict</span><span class="p">(),</span><span class="nb">dict</span><span class="p">(),</span><span class="nb">dict</span><span class="p">(),</span><span class="nb">dict</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">condN</span><span class="p">):</span>
        <span class="n">n_indx</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">where</span><span class="p">(</span><span class="n">labels</span><span class="o">==</span><span class="n">n</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
        <span class="n">data</span><span class="p">[</span><span class="n">n</span><span class="p">]</span> <span class="o">=</span> <span class="n">old_data</span><span class="p">[</span><span class="n">n_indx</span><span class="p">]</span>
        <span class="c1"># 40 trials * 27 channels * 75 times
</span>        <span class="n">label</span><span class="p">[</span><span class="n">n</span><span class="p">]</span> <span class="o">=</span> <span class="n">labels</span><span class="p">[</span><span class="n">n_indx</span><span class="p">]</span>
        <span class="c1"># 40 labels
</span>
        <span class="c1"># permutation 1st dimension
</span>        <span class="n">permu_data</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">permutation</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="n">n</span><span class="p">])</span>
        <span class="c1"># 40 trials * 27 channels * 75 times
</span>        <span class="s">'''
        permu_indx = np.random.permutation(trialN)
        permu_data = data[n][permu_indx,:,:]
        '''</span>
        <span class="n">n_list</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="k">for</span> <span class="n">k</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">trialN_final</span><span class="p">):</span>
            <span class="n">n_list</span><span class="p">.</span><span class="n">append</span><span class="p">(</span>
                <span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">permu_data</span><span class="p">[</span><span class="n">k</span><span class="o">*</span><span class="n">trial_space</span><span class="p">:</span><span class="n">k</span><span class="o">*</span><span class="n">trial_space</span><span class="o">+</span><span class="n">trial_space</span><span class="p">],</span><span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">))</span>
        <span class="n">data_final</span><span class="p">[</span><span class="n">n</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">array</span><span class="p">(</span><span class="n">n_list</span><span class="p">)</span>
        <span class="n">label_final</span><span class="p">[</span><span class="n">n</span><span class="p">]</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">repeat</span><span class="p">(</span><span class="n">n</span><span class="p">,</span><span class="n">trialN_final</span><span class="p">)</span>

        <span class="c1"># 3 trials * 27 channels * 75 times
</span>    <span class="n">data_final</span> <span class="o">=</span> <span class="n">dict_to_arr</span><span class="p">(</span><span class="n">data_final</span><span class="p">)</span>
    <span class="c1"># 16 conditions * 3 trials * 27 channels * 75 times
</span>    <span class="n">label_final</span> <span class="o">=</span> <span class="n">dict_to_arr</span><span class="p">(</span><span class="n">label_final</span><span class="p">)</span>
    <span class="c1"># 16 conditions * 3 labels
</span>
    <span class="n">data_final</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span>
        <span class="n">data_final</span><span class="p">,[</span><span class="nb">int</span><span class="p">(</span><span class="n">condN</span><span class="o">*</span><span class="n">trialN_final</span><span class="p">),</span><span class="n">chans</span><span class="p">,</span><span class="n">t_points</span><span class="p">])</span>
    <span class="c1"># 48 trials * 27 channels * 75 times
</span>    <span class="n">label_final</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">reshape</span><span class="p">(</span><span class="n">label_final</span><span class="p">,[</span><span class="nb">int</span><span class="p">(</span><span class="n">condN</span><span class="o">*</span><span class="n">trialN_final</span><span class="p">)])</span>
    <span class="c1"># 48 labels
</span>
    <span class="k">return</span> <span class="n">data_final</span><span class="p">,</span><span class="n">label_final</span>
</code></pre></div></div>

<p>With the data preprocessed, we can formally start <strong><em>decoding</em></strong>. The core move is splitting the data into two sets, one for training and one for testing. In practice you never have just two fixed sets; K-fold splitting is used instead, e.g. 3 folds: for each time point, train on two folds and test on the remaining one, then re-split and rotate which fold is held out until every fold has served as the test set. This is cross-validation (it comes in many flavors, LOOCV among them; a quick search turns up plenty). Each test yields an accuracy; averaging the 3 accuracies gives the decoding accuracy, so you end up with an accuracy for each of the 75 time points. AUC also works here (I use it often). Three folds is what Bae's paper uses and a common training default (sometimes 5); I generally use 10 folds, which makes the results quite stable.</p>

<p>One important caveat: when using K-fold, always use <strong>StratifiedKFold</strong> (do not use the plain KFold function!), which guarantees that classes stay balanced across folds.</p>
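<p>The difference is easy to demonstrate on a toy label vector: plain KFold on class-sorted labels can produce a test fold containing only one class, while StratifiedKFold keeps both classes in every fold (a minimal check, not from the original analysis):</p>

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

y = np.repeat([0, 1], 6)   # 6 trials per class, sorted by class
X = np.zeros((12, 27))     # dummy feature matrix

# plain KFold without shuffling: the first test fold is all class 0
_, test_plain = next(iter(KFold(n_splits=3).split(X)))
print(set(y[test_plain].tolist()))   # {0}

# StratifiedKFold: every test fold contains both classes
_, test_strat = next(iter(StratifiedKFold(n_splits=3).split(X, y)))
print(set(y[test_strat].tolist()))   # {0, 1}
```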

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="kn">from</span> <span class="nn">sklearn.model_selection</span> <span class="kn">import</span> <span class="n">StratifiedKFold</span>
<span class="kn">from</span> <span class="nn">sklearn.preprocessing</span> <span class="kn">import</span> <span class="n">StandardScaler</span>
<span class="kn">from</span> <span class="nn">sklearn.svm</span> <span class="kn">import</span> <span class="n">SVC</span>
<span class="kn">from</span> <span class="nn">sklearn.metrics</span> <span class="kn">import</span> <span class="n">roc_auc_score</span>
<span class="kn">import</span> <span class="nn">scipy.stats</span>
<span class="kn">from</span> <span class="nn">mne.stats</span> <span class="kn">import</span> <span class="n">permutation_cluster_1samp_test</span><span class="p">,</span> \
    <span class="n">permutation_cluster_test</span>

<span class="n">fdN</span> <span class="o">=</span> <span class="mi">3</span>
<span class="n">rpN</span> <span class="o">=</span> <span class="mi">10</span>
<span class="n">n_permutations</span> <span class="o">=</span> <span class="mi">1000</span>
<span class="n">chance_crit</span> <span class="o">=</span> <span class="mf">0.5</span>
<span class="n">p_crit</span> <span class="o">=</span> <span class="mf">0.05</span>

<span class="k">def</span> <span class="nf">deco_in</span><span class="p">(</span><span class="n">old_data</span><span class="p">):</span>
    <span class="n">accs</span><span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">([</span><span class="n">rpN</span><span class="p">,</span><span class="n">t_points</span><span class="p">,</span><span class="n">fdN</span><span class="p">])</span>
    <span class="k">for</span> <span class="n">n</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">rpN</span><span class="p">):</span>
        <span class="c1"># average trials
</span>        <span class="n">data_final</span><span class="p">,</span><span class="n">label_final</span> <span class="o">=</span> <span class="n">avg_trials</span><span class="p">(</span><span class="n">old_data</span><span class="p">)</span>

        <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">t_points</span><span class="p">):</span>
            <span class="n">data_t</span> <span class="o">=</span> <span class="n">data_final</span><span class="p">[:,:,</span><span class="n">t</span><span class="p">]</span>
            <span class="c1"># 48 trials * 27 channels
</span>            <span class="n">seed</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">random</span><span class="p">.</span><span class="n">randint</span><span class="p">(</span><span class="mi">1000</span><span class="p">)</span>
            <span class="n">kf</span> <span class="o">=</span> <span class="n">StratifiedKFold</span><span class="p">(</span><span class="n">n_splits</span><span class="o">=</span><span class="n">fdN</span><span class="p">,</span><span class="n">shuffle</span><span class="o">=</span><span class="bp">True</span><span class="p">,</span><span class="n">random_state</span><span class="o">=</span><span class="n">seed</span><span class="p">)</span>
            <span class="c1"># shuffle=True matters when trials were not already randomly shuffled before averaging
</span>
            <span class="n">fdN_indx</span> <span class="o">=</span> <span class="mi">0</span>
            <span class="k">for</span> <span class="n">train_indx</span><span class="p">,</span><span class="n">test_indx</span> <span class="ow">in</span> <span class="n">kf</span><span class="p">.</span><span class="n">split</span><span class="p">(</span><span class="n">data_t</span><span class="p">,</span><span class="n">label_final</span><span class="p">):</span>
                <span class="n">data_train</span><span class="p">,</span><span class="n">data_test</span><span class="p">,</span><span class="n">label_train</span><span class="p">,</span><span class="n">label_test</span> <span class="o">=</span> \
                <span class="n">data_t</span><span class="p">[</span><span class="n">train_indx</span><span class="p">],</span><span class="n">data_t</span><span class="p">[</span><span class="n">test_indx</span><span class="p">],</span><span class="n">label_final</span><span class="p">[</span><span class="n">train_indx</span><span class="p">],</span><span class="n">label_final</span><span class="p">[</span><span class="n">test_indx</span><span class="p">]</span>

                <span class="c1"># normalization
</span>                <span class="n">scaler</span> <span class="o">=</span> <span class="n">StandardScaler</span><span class="p">()</span>
                <span class="n">data_train</span> <span class="o">=</span> <span class="n">scaler</span><span class="p">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">data_train</span><span class="p">)</span>
                <span class="n">data_test</span> <span class="o">=</span> <span class="n">scaler</span><span class="p">.</span><span class="n">transform</span><span class="p">(</span><span class="n">data_test</span><span class="p">)</span>
                <span class="n">svm</span> <span class="o">=</span> <span class="n">SVC</span><span class="p">(</span><span class="n">kernel</span><span class="o">=</span><span class="s">'linear'</span><span class="p">)</span>
                <span class="n">svm</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="n">data_train</span><span class="p">,</span><span class="n">label_train</span><span class="p">)</span>
                <span class="n">acc</span> <span class="o">=</span> <span class="n">svm</span><span class="p">.</span><span class="n">score</span><span class="p">(</span><span class="n">data_test</span><span class="p">,</span><span class="n">label_test</span><span class="p">)</span>
                <span class="n">accs</span><span class="p">[</span><span class="n">n</span><span class="p">,</span><span class="n">t</span><span class="p">,</span><span class="n">fdN_indx</span><span class="p">]</span> <span class="o">=</span> <span class="n">acc</span>

                <span class="n">fdN_indx</span> <span class="o">+=</span> <span class="mi">1</span>

    <span class="n">acc_list</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">accs</span><span class="p">,</span><span class="n">axis</span><span class="o">=</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span><span class="mi">2</span><span class="p">))</span>
    <span class="k">return</span> <span class="n">acc_list</span>

<span class="n">acc_list</span> <span class="o">=</span> <span class="n">deco_in</span><span class="p">(</span><span class="n">subj_data_resamp</span><span class="p">)</span>
</code></pre></div></div>

<p>This was within-task decoding; cross-task decoding is generally more recommended. The principle is exactly the same, and the code is even simpler.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1"># across-task decoding
</span>
<span class="k">def</span> <span class="nf">deco_cx</span><span class="p">(</span><span class="n">task1</span><span class="p">,</span><span class="n">task2</span><span class="p">,</span><span class="n">labels1</span><span class="p">,</span><span class="n">labels2</span><span class="p">):</span>
    <span class="n">acc_list</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">zeros</span><span class="p">([</span><span class="n">t_points</span><span class="p">])</span>

    <span class="c1"># decoding based on epoch
</span>    <span class="k">for</span> <span class="n">t</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">t_points</span><span class="p">):</span>
        <span class="n">data_train</span> <span class="o">=</span> <span class="n">task1</span><span class="p">[:,:,</span><span class="n">t</span><span class="p">]</span>
        <span class="n">data_test</span> <span class="o">=</span> <span class="n">task2</span><span class="p">[:,:,</span><span class="n">t</span><span class="p">]</span>

        <span class="n">scaler</span> <span class="o">=</span> <span class="n">StandardScaler</span><span class="p">()</span>
        <span class="n">x_train</span> <span class="o">=</span> <span class="n">scaler</span><span class="p">.</span><span class="n">fit_transform</span><span class="p">(</span><span class="n">data_train</span><span class="p">)</span>
        <span class="n">x_test</span> <span class="o">=</span> <span class="n">scaler</span><span class="p">.</span><span class="n">transform</span><span class="p">(</span><span class="n">data_test</span><span class="p">)</span>
        <span class="n">clf</span> <span class="o">=</span> <span class="n">SVC</span><span class="p">(</span><span class="n">kernel</span><span class="o">=</span><span class="s">'rbf'</span><span class="p">,</span><span class="n">class_weight</span><span class="o">=</span><span class="s">'balanced'</span><span class="p">)</span>
        <span class="n">clf</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="n">x_train</span><span class="p">,</span><span class="n">labels1</span><span class="p">)</span>
        <span class="c1"># decision_function gives continuous scores, as AUC expects</span>
        <span class="n">pred</span> <span class="o">=</span> <span class="n">clf</span><span class="p">.</span><span class="n">decision_function</span><span class="p">(</span><span class="n">x_test</span><span class="p">)</span>
        <span class="n">acc_list</span><span class="p">[</span><span class="n">t</span><span class="p">]</span> <span class="o">=</span> <span class="n">roc_auc_score</span><span class="p">(</span><span class="n">labels2</span><span class="p">,</span><span class="n">pred</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">acc_list</span>
</code></pre></div></div>

<p>Last step: how do we get <strong><em>group-level</em></strong> results? Everything so far is single-subject decoding. Write a for loop to decode every subject: with 30 subjects you get 30 data points per time point, with 60 subjects 60, and so on. Then run a one-sample t-test at each time point against chance level, which is 0.5 for a two-class problem (0.0625 in Bae's paper, with 16 classes). A cluster-based permutation t-test is recommended; my code here handles 2 classes, hence the 0.5 chance level.</p>
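<p>Stacked across subjects, the accuracies form a subjects-by-time matrix, which is exactly what the cluster test consumes; a sketch with simulated per-subject accuracy curves standing in for the real per-subject output (subjN and the accuracy values are made up):</p>

```python
import numpy as np

subjN, t_points = 30, 75
rng = np.random.default_rng(1)
# in practice: one decoding-accuracy curve per subject from the
# single-subject pipeline; simulated here so the sketch runs stand-alone
acc_data = np.vstack([rng.normal(0.55, 0.05, t_points)
                      for _ in range(subjN)])
# one row per subject, one column per time point
print(acc_data.shape)  # (30, 75)
# acc_data minus chance level (0.5 for 2 classes) is what gets fed
# to the cluster-based permutation test
```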

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="k">def</span> <span class="nf">find_sig</span><span class="p">(</span><span class="n">clu</span><span class="p">,</span><span class="n">clu_p</span><span class="p">):</span>
    <span class="n">acc_sig</span><span class="p">,</span><span class="n">grp_sig</span> <span class="o">=</span> <span class="n">t_points</span><span class="o">*</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span><span class="n">t_points</span><span class="o">*</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="n">grp_label</span> <span class="o">=</span> <span class="mi">0</span>

    <span class="k">for</span> <span class="n">c</span><span class="p">,</span><span class="n">p</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">clu</span><span class="p">,</span><span class="n">clu_p</span><span class="p">):</span>
        <span class="k">if</span> <span class="n">p</span><span class="o">&lt;=</span><span class="n">p_crit</span><span class="p">:</span>
            <span class="n">grp_label</span> <span class="o">+=</span> <span class="mi">1</span>
            <span class="n">acc_sig</span><span class="p">[</span><span class="n">c</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]:(</span><span class="n">c</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span><span class="o">+</span><span class="mi">1</span><span class="p">)]</span> <span class="o">=</span> \
                <span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="o">*</span><span class="nb">len</span><span class="p">(</span><span class="n">c</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
            <span class="n">grp_sig</span><span class="p">[</span><span class="n">c</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="mi">0</span><span class="p">]:(</span><span class="n">c</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="o">-</span><span class="mi">1</span><span class="p">]</span><span class="o">+</span><span class="mi">1</span><span class="p">)]</span> <span class="o">=</span> \
                <span class="p">[</span><span class="n">grp_label</span><span class="p">]</span><span class="o">*</span><span class="nb">len</span><span class="p">(</span><span class="n">c</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>

    <span class="k">return</span> <span class="n">acc_sig</span><span class="p">,</span><span class="n">grp_sig</span>

<span class="k">def</span> <span class="nf">clu_permu_1samp_t</span><span class="p">(</span><span class="n">acc_data</span><span class="p">):</span>
    <span class="n">tail</span> <span class="o">=</span> <span class="mi">0</span>  <span class="c1"># two-tailed test</span>
    <span class="n">degrees_of_freedom</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">acc_data</span><span class="p">)</span> <span class="o">-</span> <span class="mi">1</span>
    <span class="c1"># t value matching p_crit for a two-tailed one-sample t-test</span>
    <span class="n">t_thresh</span> <span class="o">=</span> <span class="n">scipy</span><span class="p">.</span><span class="n">stats</span><span class="p">.</span><span class="n">t</span><span class="p">.</span><span class="n">ppf</span><span class="p">(</span><span class="mi">1</span><span class="o">-</span><span class="n">p_crit</span><span class="o">/</span><span class="mi">2</span><span class="p">,</span><span class="n">df</span><span class="o">=</span><span class="n">degrees_of_freedom</span><span class="p">)</span>

    <span class="n">t_obs</span><span class="p">,</span><span class="n">clu</span><span class="p">,</span><span class="n">clu_p</span><span class="p">,</span><span class="n">H0</span> <span class="o">=</span> <span class="n">permutation_cluster_1samp_test</span><span class="p">(</span>
        <span class="n">acc_data</span><span class="o">-</span><span class="n">chance_crit</span><span class="p">,</span><span class="n">n_permutations</span><span class="o">=</span><span class="n">n_permutations</span><span class="p">,</span>
        <span class="n">threshold</span><span class="o">=</span><span class="n">t_thresh</span><span class="p">,</span><span class="n">tail</span><span class="o">=</span><span class="n">tail</span><span class="p">,</span><span class="n">out_type</span><span class="o">=</span><span class="s">'indices'</span><span class="p">,</span><span class="n">verbose</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">acc_sig</span><span class="p">,</span><span class="n">grp_sig</span> <span class="o">=</span> <span class="n">find_sig</span><span class="p">(</span><span class="n">clu</span><span class="p">,</span><span class="n">clu_p</span><span class="p">)</span>

    <span class="k">return</span> <span class="n">acc_sig</span><span class="p">,</span><span class="n">grp_sig</span>
</code></pre></div></div>
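<p>To make the shapes concrete: with <code>out_type='indices'</code>, each element of <code>clu</code> is a tuple holding one array of time-point indices, and <code>clu_p</code> holds one p-value per cluster. Here is a self-contained toy run of the same labelling loop (the clusters and p-values below are invented for illustration):</p>

```python
# Toy illustration of the cluster-labelling loop used in find_sig.
# clu mimics MNE's out_type='indices' output: one index array per cluster.
n_times = 10
p_crit = 0.05

# two clusters: indices 2-4 (significant) and 7-8 (not significant)
clu = [(list(range(2, 5)),), (list(range(7, 9)),)]
clu_p = [0.01, 0.30]

acc_sig, grp_sig = [0] * n_times, [0] * n_times
grp_label = 0
for c, p in zip(clu, clu_p):
    if p <= p_crit:
        grp_label += 1  # each significant cluster gets its own label
        acc_sig[c[0][0]:c[0][-1] + 1] = [1] * len(c[0])
        grp_sig[c[0][0]:c[0][-1] + 1] = [grp_label] * len(c[0])

print(acc_sig)  # [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
print(grp_sig)  # [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]
```

<p>Only the first cluster survives the p-value check, so only indices 2 through 4 are flagged.</p>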

<p>That gives the final result. acc_sig is a list of labels: find which indices hold the value 1, and that stretch is the significant time window. I wrote it this way because I am used to storing my data in a dataframe; I simply add acc_sig as a column, which makes plotting easier later, nothing more than that. You can also just print the time points and see which time windows are significant. grp_sig records how many clusters there are; often only one time window is significant and its only label is 1, so this part is not important.</p>
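<p>If you would rather print the time points than add a dataframe column, here is a minimal stdlib sketch that turns acc_sig into explicit time windows (the acc_sig values and the 10-ms sampling grid below are made up for illustration):</p>

```python
# Recover significant time windows from a 0/1 significance mask.
from itertools import groupby

acc_sig = [0, 0, 1, 1, 1, 0, 1, 1, 0, 0]
times = [i * 10 for i in range(len(acc_sig))]  # hypothetical time axis in ms

windows, idx = [], 0
for val, run in groupby(acc_sig):
    run_len = len(list(run))
    if val == 1:
        # start and end time of one run of consecutive 1s
        windows.append((times[idx], times[idx + run_len - 1]))
    idx += run_len

print(windows)  # [(20, 40), (60, 70)]
```

<p>Each tuple is one significant window in the units of your time axis.</p>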

<p>That is the whole decoding tutorial. All done!</p>

<p><strong>About the easter eggs</strong></p>

<p>The first easter egg: MNE actually provides several dedicated decoding functions, so we can use them directly instead of hand-rolling everything as above. Again, I include the code for both within- and across-task decoding:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="c1"># within-task decoding
</span>
<span class="k">def</span> <span class="nf">gen_decoding_in</span><span class="p">(</span><span class="n">X</span><span class="p">,</span><span class="n">y</span><span class="p">):</span>
    <span class="n">clf</span> <span class="o">=</span> <span class="n">make_pipeline</span><span class="p">(</span><span class="n">StandardScaler</span><span class="p">(),</span><span class="n">LinearModel</span><span class="p">(</span><span class="n">LogisticRegression</span><span class="p">(</span><span class="n">solver</span><span class="o">=</span><span class="s">'liblinear'</span><span class="p">)))</span>
    <span class="n">time_gen</span> <span class="o">=</span> <span class="n">GeneralizingEstimator</span><span class="p">(</span><span class="n">clf</span><span class="p">,</span><span class="n">n_jobs</span><span class="o">=</span><span class="n">jobN</span><span class="p">,</span><span class="n">scoring</span><span class="o">=</span><span class="n">scoring</span><span class="p">,</span><span class="n">verbose</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

    <span class="n">scores</span> <span class="o">=</span> <span class="n">cross_val_multiscore</span><span class="p">(</span><span class="n">time_gen</span><span class="p">,</span><span class="n">X</span><span class="p">,</span><span class="n">y</span><span class="p">,</span><span class="n">cv</span><span class="o">=</span><span class="n">fdN</span><span class="p">,</span><span class="n">n_jobs</span><span class="o">=</span><span class="n">jobN</span><span class="p">)</span>
    <span class="n">mean_score</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">mean</span><span class="p">(</span><span class="n">scores</span><span class="p">,</span><span class="n">axis</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span>
    <span class="n">mean_score_diag</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">mean_score</span><span class="p">)</span>

    <span class="k">return</span> <span class="n">scores</span><span class="p">,</span><span class="n">mean_score</span><span class="p">,</span><span class="n">mean_score_diag</span>

<span class="c1"># across-task decoding
</span>
<span class="k">def</span> <span class="nf">gen_decoding_cx</span><span class="p">(</span><span class="n">task1</span><span class="p">,</span><span class="n">task2</span><span class="p">,</span><span class="n">labels1</span><span class="p">,</span><span class="n">labels2</span><span class="p">):</span>
    <span class="n">clf</span> <span class="o">=</span> <span class="n">make_pipeline</span><span class="p">(</span><span class="n">StandardScaler</span><span class="p">(),</span><span class="n">LogisticRegression</span><span class="p">(</span><span class="n">solver</span><span class="o">=</span><span class="s">'liblinear'</span><span class="p">))</span>
    <span class="n">time_gen</span> <span class="o">=</span> <span class="n">GeneralizingEstimator</span><span class="p">(</span><span class="n">clf</span><span class="p">,</span><span class="n">scoring</span><span class="o">=</span><span class="s">'roc_auc'</span><span class="p">,</span><span class="n">n_jobs</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span><span class="n">verbose</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>

    <span class="n">time_gen</span><span class="p">.</span><span class="n">fit</span><span class="p">(</span><span class="n">X</span><span class="o">=</span><span class="n">task1</span><span class="p">,</span><span class="n">y</span><span class="o">=</span><span class="n">labels1</span><span class="p">)</span>
    <span class="n">scores</span> <span class="o">=</span> <span class="n">time_gen</span><span class="p">.</span><span class="n">score</span><span class="p">(</span><span class="n">X</span><span class="o">=</span><span class="n">task2</span><span class="p">,</span><span class="n">y</span><span class="o">=</span><span class="n">labels2</span><span class="p">)</span>
    <span class="n">scores_diag</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="n">diag</span><span class="p">(</span><span class="n">scores</span><span class="p">)</span>

    <span class="k">return</span> <span class="n">scores</span><span class="p">,</span><span class="n">scores_diag</span>
</code></pre></div></div>
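<p>To see what the across-task scheme computes, here is a self-contained NumPy sketch of time generalization: fit a classifier at every training time point of task 1, then score it at every testing time point of task 2. To keep it dependency-free, a toy nearest-class-mean classifier stands in for the StandardScaler + LogisticRegression pipeline, and all data are synthetic; only the train/test loop mirrors what GeneralizingEstimator does.</p>

```python
# Sketch of cross-task time generalization with a nearest-class-mean
# classifier. Data shapes follow MNE's (n_trials, n_channels, n_times).
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_ch, n_times = 40, 8, 5

def make_task(shift):
    """Synthetic task: class-1 trials get a mean offset on every channel."""
    y = np.repeat([0, 1], n_trials // 2)
    X = rng.normal(size=(n_trials, n_ch, n_times))
    X[y == 1] += shift
    return X, y

X1, y1 = make_task(1.0)   # "training" task
X2, y2 = make_task(1.0)   # "testing" task, same underlying signal

scores = np.zeros((n_times, n_times))
for t_train in range(n_times):
    # "fit": class means on task 1 at time t_train
    m0 = X1[y1 == 0, :, t_train].mean(axis=0)
    m1 = X1[y1 == 1, :, t_train].mean(axis=0)
    for t_test in range(n_times):
        # "score": classify task-2 trials at time t_test by nearest mean
        d0 = np.linalg.norm(X2[:, :, t_test] - m0, axis=1)
        d1 = np.linalg.norm(X2[:, :, t_test] - m1, axis=1)
        pred = (d1 < d0).astype(int)
        scores[t_train, t_test] = (pred == y2).mean()

scores_diag = np.diag(scores)  # train-time == test-time decoding curve
print(scores.shape, scores_diag.shape)  # (5, 5) (5,)
```

<p>The diagonal of the matrix is the usual decoding time course (train and test at the same time point); the off-diagonal cells test whether a pattern learned at one time generalizes to another.</p>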

<p>The second easter egg is also a tutorial. Search WeChat for the official account <strong><em>“路同学”</em></strong>. Let me state up front that I do not know this person and am not advertising for him! He has a piece on exactly how to do decoding, <a href="https://mp.weixin.qq.com/s/hg8Gm1mMLlPP6ws5_JXVEA">全网独一无二基于Python逐行代码手把手实现与讲解的EEG Decoding视频教程</a>. It is paid, but inexpensive and worth it. His tutorial is also based on the Bae paper and really does walk you through it hand in hand; I could never be that patient myself. If you find this post a bit hard to follow, you cannot go wrong with his tutorial.</p>

<p>Finally, I strongly recommend reading the original Bae paper; it is written in great detail. My code here was written from that paper alone, back when I had zero experience. Now you know why I keep stressing, over and over, that you must read the original.</p>

<p>The end.</p>]]></content><author><name>Linlin Shang</name><email>shanglinlin@lnu.edu.cn</email></author><category term="python" /><category term="Decoding" /><category term="MVPA" /><category term="ML" /><summary type="html"><![CDATA[Note: this post is based on Bae's 2018 J. Neurosci. paper Dissociable Decoding of Spatial Attention and Working Memory from EEG Oscillations and Sustained Potentials; reading the original is strongly recommended. This post is a manual written only for readers who understand EEG and fMRI experiments, have preprocessing experience, and can read code, but have no decoding experience; if you lack that background, it is not recommended reading. Please credit the source when reposting.]]></summary></entry><entry><title type="html">Python基础教程</title><link href="https://shangll.github.io/posts/2016/08/py-notes/" rel="alternate" type="text/html" title="Python基础教程" /><published>2016-08-25T00:00:00+00:00</published><updated>2016-08-25T00:00:00+00:00</updated><id>https://shangll.github.io/posts/2016/08/py-notes</id><content type="html" xml:base="https://shangll.github.io/posts/2016/08/py-notes/"><![CDATA[<p>Based on the second book of Python for Everyone; essentially a direct Chinese translation of that book.</p>

<p><a href="https://shangll.notion.site/CH-1-INTRO-c1ed960045ce4f868619d96d12b2101a?pvs=4">CH.1 - INTRO (can be skipped)</a></p>

<p><a href="https://shangll.notion.site/CH-2-NUMBERS-STRINGS-cb586013495943a68e5dc870f84bb63d?pvs=4">CH.2 - NUMBERS &amp; STRINGS</a></p>

<p><a href="https://shangll.notion.site/CH-3-DECISION-8a7794a6c2e24824bef18f6b21446ef2?pvs=4">CH.3 - DECISION</a></p>

<p><a href="https://shangll.notion.site/CH-4-LOOPS-d945fbcfbeaa43d9a73bf59a8636cea0?pvs=4">CH.4 - LOOPS</a></p>

<p><a href="https://shangll.notion.site/CH-5-FUNCTIONS-cc481748b4b1499cb8608601529b71bc?pvs=4">CH.5 - FUNCTIONS</a></p>

<p><a href="https://shangll.notion.site/CH-6-LISTS-b81ca65d4a434cff9f62b716ce5c92ad?pvs=4">CH.6 - LISTS</a></p>

<p><a href="https://shangll.notion.site/CH-7-FILES-36618c5728364f8b8164a087ca78a068?pvs=4">CH.7 - FILES</a></p>

<p><a href="https://shangll.notion.site/CH-8-SET-AND-DICTIONARIES-fa99ddf7670e43a0bd2b66932124ead1?pvs=4">CH.8 - SET AND DICTIONARIES</a></p>

<p><a href="https://shangll.notion.site/CH-9-CLASS-58a320dfc896486eb2ec32aadbbe2f1c?pvs=4">CH.9 - CLASS</a></p>]]></content><author><name>Linlin Shang</name><email>shanglinlin@lnu.edu.cn</email></author><category term="python" /><summary type="html"><![CDATA[Based on the second book of Python for Everyone; essentially a direct Chinese translation of that book.]]></summary></entry></feed>