Genome sequence of the novel coronavirus from Wuhan

The genome sequence has been published at NCBI.
https://www.ncbi.nlm.nih.gov/nuccore/MN908947
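
For anyone who wants to pull the same record programmatically rather than through the browser, here is a minimal sketch using NCBI's E-utilities `efetch` endpoint. The helper names `fasta_url` and `fetch_fasta` are my own, made up for illustration; this is one possible way to do it, not how the sequence was obtained for this post.

```python
import urllib.request
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def fasta_url(accession: str) -> str:
    """Build an efetch URL that returns a nucleotide record as FASTA text."""
    params = {
        "db": "nuccore",      # nucleotide database
        "id": accession,      # e.g. "MN908947.3"
        "rettype": "fasta",
        "retmode": "text",
    }
    return f"{EFETCH}?{urlencode(params)}"


def fetch_fasta(accession: str) -> str:
    """Download the FASTA record (hypothetical helper; needs network access)."""
    with urllib.request.urlopen(fasta_url(accession)) as response:
        return response.read().decode("utf-8")


print(fasta_url("MN908947.3"))
```

Calling `fetch_fasta("MN908947.3")` would then return the FASTA text, assuming the network is reachable and the service responds as usual.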

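The record is named "Wuhan seafood market pneumonia virus" (武漢海鮮市場肺炎ウイルス, if you translate it literally into Japanese). The sequencer appears to have been an Illumina instrument, and Megahit appears to have been used for the assembly.
For a start, I just ran BLAST on it without thinking too hard.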

Sequences producing significant alignments:
                                                                  Max    Total Query   E   Per.                  
Description                                                       Score  Score cover Value Ident  Accession        
Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1,...       55221  55221 100%  0.0   100.00 NC_045512.2      
Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1,...       55221  55221 100%  0.0   100.00 MN908947.3       
Wuhan seafood market pneumonia virus isolate...                   55171  55171 99%   0.0   99.98  MN975262.1       
Wuhan seafood market pneumonia virus isolate...                   55166  55166 99%   0.0   99.99  MN985325.1       
Wuhan seafood market pneumonia virus isolate...                   55153  55153 99%   0.0   99.97  MN988713.1       
Wuhan seafood market pneumonia virus isolate...                   55084  55084 99%   0.0   99.99  MN938384.1       
Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome 26943  35336 95%   0.0   89.12  MG772933.1       
Bat SARS-like coronavirus isolate bat-SL-CoVZXC21, complete...    22223  35276 94%   0.0   88.65  MG772934.1       
SARS coronavirus ZS-C, complete genome                            15213  22564 88%   0.0   82.34  AY395003.1       
SARS coronavirus ZS-B, complete genome                            15213  22600 88%   0.0   82.34  AY394996.1       
SARS coronavirus SZ16, complete genome                            15202  22531 88%   0.0   82.33  AY304488.1       
SARS coronavirus SZ3, complete genome                             15202  22529 88%   0.0   82.33  AY304486.1       
SARS coronavirus GZ02, complete genome                            15191  22548 88%   0.0   82.32  AY390556.1       
SARS coronavirus BJ182-12, complete genome                        15186  22483 88%   0.0   82.32  EU371564.1       
SARS coronavirus HSZ-Bb, complete genome                          15186  22276 87%   0.0   82.31  AY394985.1       
SARS coronavirus CUHK-W1, complete genome                         15186  22577 88%   0.0   82.32  AY278554.2       
SARS coronavirus ZJ02, complete genome                            15180  22548 88%   0.0   82.31  EU371559.1       
SARS coronavirus Sin845, complete genome                          15180  22507 88%   0.0   82.31  AY559093.1       
SARS coronavirus HSZ-Bc, complete genome                          15180  22566 88%   0.0   82.31  AY394994.1       
SARS coronavirus HSZ-Cb, complete genome                          15180  22526 88%   0.0   82.31  AY394986.1       
Coronavirus BtRs-BetaCoV/YN2018B, complete genome                 15176  22618 91%   0.0   82.32  MK211376.1       
Bat SARS-like coronavirus isolate Rs4231, complete genome         15176  22534 91%   0.0   82.30  KY417146.1       
SARS coronavirus isolate Tor2/FP1-10851, complete genome          15175  22417 88%   0.0   82.30  JX163927.1       
SARS coronavirus isolate Tor2/FP1-10912, complete genome          15175  22424 88%   0.0   82.30  JX163926.1       
SARS coronavirus isolate Tor2/FP1-10912, complete genome          15175  22417 88%   0.0   82.30  JX163923.1       
SARS coronavirus HKU-39849 isolate UOB, complete genome           15175  22529 88%   0.0   82.30  JQ316196.1       
SARS coronavirus P2, complete genome                              15175  22450 88%   0.0   82.30  FJ882963.1       
SARS coronavirus strain CV7, complete genome                      15175  22574 88%   0.0   82.30  DQ898174.1       
SARS coronavirus BJ202, complete genome                           15175  22579 88%   0.0   82.30  AY864806.1       
SARS Coronavirus CDC#200301157, complete genome                   15175  22529 88%   0.0   82.30  AY714217.1       
SARS coronavirus Sin850, complete genome                          15175  22516 88%   0.0   82.30  AY559096.1       
SARS coronavirus Sin847, complete genome                          15175  22504 88%   0.0   82.30  AY559095.1       
SARS coronavirus Sin849, complete genome                          15175  22502 88%   0.0   82.30  AY559086.1       
SARS coronavirus Sin848, complete genome                          15175  22504 88%   0.0   82.30  AY559085.1       
Severe acute respiratory syndrome-related coronavirus isolate...  15175  22568 88%   0.0   82.30  AY274119.3       
SARS coronavirus HSR 1, complete genome                           15175  22579 88%   0.0   82.30  AY323977.2       
SARS coronavirus TW1, complete genome                             15175  22539 88%   0.0   82.30  AY291451.1       
SARS coronavirus TW5, complete genome                             15175  22533 88%   0.0   82.30  AY502928.1       
SARS coronavirus TW3, complete genome                             15175  22526 88%   0.0   82.30  AY502926.1       
SARS coronavirus TW10, complete genome                            15175  22539 88%   0.0   82.30  AY502923.1       
SARS coronavirus LC2, complete genome                             15175  22511 88%   0.0   82.30  AY394999.1       
SARS coronavirus LC1, complete genome                             15175  22561 88%   0.0   82.30  AY394998.1       
SARS coronavirus HSZ-Cc, complete genome                          15175  22555 88%   0.0   82.30  AY394995.1       
SARS coronavirus HGZ8L2, complete genome                          15175  22561 88%   0.0   82.31  AY394993.1       
SARS coronavirus HZS2-C, complete genome                          15175  22566 88%   0.0   82.31  AY394992.1       
SARS coronavirus HZS2-Fc, complete genome                         15175  22566 88%   0.0   82.30  AY394991.1       
SARS coronavirus HZS2-Fb, complete genome                         15175  22548 88%   0.0   82.30  AY394987.1       
SARS coronavirus HSZ2-A, complete genome                          15175  22535 88%   0.0   82.30  AY394983.1       
SARS coronavirus GZ-B, complete genome                            15175  22461 88%   0.0   82.30  AY394978.1       
SARS coronavirus PUMC02, complete genome                          15175  22563 88%   0.0   82.30  AY357075.1       
SARS coronavirus CUHK-Su10, complete genome                       15175  22561 88%   0.0   82.30  AY282752.2       
SARS coronavirus AS, complete genome                              15175  22522 88%   0.0   82.30  AY427439.1       
SARS coronavirus Sin2679, complete genome                         15175  22528 88%   0.0   82.30  AY283796.1       
SARS coronavirus TWY genomic RNA, complete genome                 15175  22535 88%   0.0   82.30  AP006561.1       
SARS coronavirus isolate Tor2/FP1-10895, complete genome          15173  22415 88%   0.0   82.30  JX163925.1       
SARS coronavirus isolate Tor2/FP1-10895, complete genome          15171  22424 88%   0.0   82.30  JX163928.1       
SARS coronavirus isolate Tor2/FP1-10851, complete genome          15169  22420 88%   0.0   82.30  JX163924.1       
SARS coronavirus BJ182-8, complete genome                         15169  22504 88%   0.0   82.30  EU371563.1       
SARS coronavirus BJ182b, complete genome                          15169  22509 88%   0.0   82.30  EU371561.1       
SARS coronavirus BJ182a, complete genome                          15169  22509 88%   0.0   82.30  EU371560.1       
SARS coronavirus Urbani, complete genome                          15169  22535 88%   0.0   82.30  AY278741.1       
SARS coronavirus Sin3725V, complete genome                        15169  22518 88%   0.0   82.29  AY559087.1       
SARS coronavirus HZS2-D, complete genome                          15169  22561 88%   0.0   82.30  AY394989.1       
SARS coronavirus Sino3-11, complete genome                        15169  22546 88%   0.0   82.30  AY485278.1       
SARS coronavirus TW4, complete genome                             15167  22526 88%   0.0   82.29  AY502927.1       
BtRs-BetaCoV/YN2013, complete genome                              15163  21673 85%   0.0   82.29  KJ473816.1       
SARS coronavirus BJ182-4, complete genome                         15158  22498 88%   0.0   82.29  EU371562.1       
Coronavirus BtRs-BetaCoV/YN2018C, complete genome                 15149  22403 88%   0.0   82.29  MK211377.1       
Bat SARS-like coronavirus isolate Rf4092, complete genome         15134  22372 86%   0.0   82.26  KY417145.1       
Coronavirus BtRs-BetaCoV/YN2018A, complete genome                 15117  22368 88%   0.0   82.26  MK211375.1       
SARS-related bat coronavirus isolate Anlong-111 orf1ab...         15084  16332 64%   0.0   82.35  KF294455.1       
Bat coronavirus Cp/Yunnan2011, complete genome                    15043  21834 86%   0.0   82.18  JX993988.1       
BtRs-BetaCoV/HuB2013, complete genome                             14970  22339 87%   0.0   82.20  KJ473814.1       
Bat coronavirus (BtCoV/279/2005), complete genome                 14916  22163 87%   0.0   82.16  DQ648857.1       
Bat coronavirus Rp/Shaanxi2011, complete genome                   14892  21819 88%   0.0   82.00  JX993987.1       
SARS-related bat coronavirus isolate Longquan-140 orf1ab...       14759  23767 91%   0.0   81.89  KF294457.1       
Coronavirus BtRl-BetaCoV/SC2018, complete genome                  14731  21836 87%   0.0   82.02  MK211374.1       
BtRf-BetaCoV/SX2013, complete genome                              14722  21360 87%   0.0   81.82  KJ473813.1       
BtRf-BetaCoV/HeB2013, complete genome                             14683  21261 86%   0.0   81.79  KJ473812.1       
BtRf-BetaCoV/JL2012, complete genome                              14683  21067 86%   0.0   81.79  KJ473811.1       
Bat SARS coronavirus HKU3-7, complete genome                      14683  21958 88%   0.0   81.82  GQ153542.1       
Bat SARS coronavirus HKU3-8, complete genome                      14678  21996 88%   0.0   81.82  GQ153543.1       
Bat coronavirus isolate Jiyuan-84, complete genome                14628  21416 87%   0.0   81.74  KY770860.1       
Bat coronavirus (BtCoV/273/2005), complete genome                 14556  21386 87%   0.0   81.66  DQ648856.1       
Bat SARS coronavirus Rf1, complete genome                         14556  21394 87%   0.0   81.66  DQ412042.1       
Bat SARS coronavirus HKU3-12, complete genome                     14550  21830 88%   0.0   81.68  GQ153547.1       
SARS-related bat coronavirus isolate Jiyuan-331 orf1ab...         14539  15829 66%   0.0   81.66  KF294456.1       
Recombinant coronavirus clone Bat SARS-CoV, complete sequence     14517  21878 89%   0.0   81.65  FJ211859.1       
bat SARS coronavirus HKU3-2, complete genome                      14512  21810 88%   0.0   81.64  DQ084199.1       
Bat SARS coronavirus HKU3-5, complete genome                      14506  21791 88%   0.0   81.63  GQ153540.1       
Bat SARS coronavirus HKU3-4, complete genome                      14506  21797 88%   0.0   81.63  GQ153539.1       
Bat SARS coronavirus HKU3-11, complete genome                     14501  21780 88%   0.0   81.63  GQ153546.1       
Bat SARS coronavirus HKU3-1, complete genome                      14501  21863 89%   0.0   81.63  DQ022305.2       
bat SARS coronavirus HKU3-3, complete genome                      14501  21849 89%   0.0   81.63  DQ084200.1       
Bat SARS coronavirus HKU3-13, complete genome                     14495  21769 88%   0.0   81.63  GQ153548.1       
Bat SARS coronavirus HKU3-6, complete genome                      14495  21797 88%   0.0   81.62  GQ153541.1       
Bat SARS coronavirus HKU3-10, complete genome                     14484  21758 88%   0.0   81.61  GQ153545.1       
Bat SARS coronavirus HKU3-9, complete genome                      14484  21758 88%   0.0   81.61  GQ153544.1       
Bat coronavirus isolate JTMC15, complete genome                   13501  20280 79%   0.0   82.94  KU182964.1       
Bat coronavirus strain 16BO133, complete genome                   13452  20206 79%   0.0   82.88  KY938558.1 

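The top hits are other sequencing results for the novel coronavirus; after those, the descriptions are full of "SARS" and "bat".
There was also a news article titled "New pneumonia virus is close to bat SARS, patient characteristics also similar: University of Hong Kong", so this makes sense.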
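
To work with the hit table above rather than just eyeballing it, the rows can be split into fields. The following is only a sketch: it assumes each row ends with the six columns shown in the header (max score, total score, query cover, E value, percent identity, accession), and the `Hit` type and `parse_hit` function are names I made up for illustration. The three sample rows are copied from the table.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    description: str
    max_score: int
    total_score: int
    query_cover: int   # percent
    e_value: float
    identity: float    # percent
    accession: str


def parse_hit(line: str) -> Hit:
    """Split one row of the hit table; the description may contain spaces,
    so the six trailing columns are peeled off from the right."""
    desc, max_s, total_s, cover, e_val, ident, acc = line.strip().rsplit(None, 6)
    return Hit(desc.strip(), int(max_s), int(total_s),
               int(cover.rstrip("%")), float(e_val), float(ident), acc)


rows = [
    "Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1,...       55221  55221 100%  0.0   100.00 MN908947.3       ",
    "Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome 26943  35336 95%   0.0   89.12  MG772933.1       ",
    "SARS coronavirus ZS-C, complete genome                            15213  22564 88%   0.0   82.34  AY395003.1       ",
]
hits = [parse_hit(row) for row in rows]
for hit in hits:
    print(f"{hit.accession:12s} identity={hit.identity:6.2f}% cover={hit.query_cover}%")
```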

While I was at it, I did a quick comparison of the genomes of MN908947.3 (Wuhan seafood market pneumonia virus) and MG772933.1 (Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome).
I drew a dot plot using a program called Gepard.


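A dot plot assigns the two genome sequences to the horizontal and vertical axes and places a dot wherever the base sequences match. When the two genomes match perfectly, a clean diagonal line appears. In this result, the two sequences largely match along their entire length. Toward the lower right, however, there is a spot where the line is slightly broken. That is where the genome sequences do not match.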
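
The idea of a dot plot can be sketched in a few lines. This is a toy word-match version and not Gepard's actual algorithm: the function names, the word length `k=4`, and the two short sequences are all assumptions chosen for illustration. A single substitution in the second sequence produces a gap in the diagonal, the same kind of break seen in the real plot.

```python
from collections import defaultdict


def dot_plot_points(seq_x: str, seq_y: str, k: int = 4) -> list:
    """Return (x, y) pairs where the k-mer starting at x in seq_x
    equals the k-mer starting at y in seq_y."""
    index = defaultdict(list)
    for i in range(len(seq_x) - k + 1):
        index[seq_x[i:i + k]].append(i)
    points = []
    for j in range(len(seq_y) - k + 1):
        for i in index.get(seq_y[j:j + k], ()):
            points.append((i, j))
    return points


def render(points, width: int, height: int) -> str:
    """Draw the points as a text grid, origin at the top left."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in points:
        grid[y][x] = "*"
    return "\n".join("".join(row) for row in grid)


# Toy sequences (made up): b is a with one base changed at index 9.
a = "ACGTTGCAAGCTAGGCTA"
b = a[:9] + "T" + a[10:]

points = dot_plot_points(a, b, k=4)
print(render(points, len(a), len(b)))
```

Every 4-mer that overlaps the changed base fails to match, so the diagonal disappears for a short stretch and resumes afterwards. Larger `k` suppresses random off-diagonal dots at the cost of wider gaps around each mismatch.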

Looking into what lies in the non-matching region, it appears to be QHD43416.1 (surface glycoprotein [Wuhan seafood market pneumonia virus]).
Could it be that a change in the surface glycoprotein of MG772933.1 (Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome) is what allowed the virus to infect humans as well?
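
Finding out what sits in a non-matching region boils down to an interval overlap check against the genome annotation. The sketch below uses made-up feature names and coordinates purely for illustration; they are not taken from the MN908947.3 record, and `overlapping_features` is a hypothetical helper.

```python
def overlapping_features(features, start: int, end: int) -> list:
    """Return names of features whose [start, end] range (1-based, inclusive)
    overlaps the query range."""
    return [name for name, f_start, f_end in features
            if f_start <= end and start <= f_end]


# Hypothetical annotation: (name, start, end). Not real coordinates.
toy_features = [
    ("gene_1", 1, 300),
    ("gene_2", 320, 700),
    ("gene_3", 710, 900),
]

# Hypothetical region where the diagonal breaks.
print(overlapping_features(toy_features, 650, 720))
```

With a real record, the feature list would come from the CDS entries in the GenBank annotation, and the query range from the coordinates where the dot plot diagonal is interrupted.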
