The genome sequence is publicly available at NCBI:
https://www.ncbi.nlm.nih.gov/nuccore/MN908947
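The record is named "Wuhan seafood market pneumonia virus" (in Japanese, roughly 武漢海鮮市場肺炎ウイルス). According to the record, the sequencer used appears to be Illumina, and the assembly appears to have been done with Megahit.
As a first step, I simply ran the sequence through BLAST.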
Sequences producing significant alignments:

| Description | Max Score | Total Score | Query cover | E value | Per. Ident | Accession |
|---|---|---|---|---|---|---|
| Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1,... | 55221 | 55221 | 100% | 0.0 | 100.00 | NC_045512.2 |
| Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1,... | 55221 | 55221 | 100% | 0.0 | 100.00 | MN908947.3 |
| Wuhan seafood market pneumonia virus isolate... | 55171 | 55171 | 99% | 0.0 | 99.98 | MN975262.1 |
| Wuhan seafood market pneumonia virus isolate... | 55166 | 55166 | 99% | 0.0 | 99.99 | MN985325.1 |
| Wuhan seafood market pneumonia virus isolate... | 55153 | 55153 | 99% | 0.0 | 99.97 | MN988713.1 |
| Wuhan seafood market pneumonia virus isolate... | 55084 | 55084 | 99% | 0.0 | 99.99 | MN938384.1 |
| Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome | 26943 | 35336 | 95% | 0.0 | 89.12 | MG772933.1 |
| Bat SARS-like coronavirus isolate bat-SL-CoVZXC21, complete... | 22223 | 35276 | 94% | 0.0 | 88.65 | MG772934.1 |
| SARS coronavirus ZS-C, complete genome | 15213 | 22564 | 88% | 0.0 | 82.34 | AY395003.1 |
| SARS coronavirus ZS-B, complete genome | 15213 | 22600 | 88% | 0.0 | 82.34 | AY394996.1 |
| SARS coronavirus SZ16, complete genome | 15202 | 22531 | 88% | 0.0 | 82.33 | AY304488.1 |
| SARS coronavirus SZ3, complete genome | 15202 | 22529 | 88% | 0.0 | 82.33 | AY304486.1 |
| SARS coronavirus GZ02, complete genome | 15191 | 22548 | 88% | 0.0 | 82.32 | AY390556.1 |
| SARS coronavirus BJ182-12, complete genome | 15186 | 22483 | 88% | 0.0 | 82.32 | EU371564.1 |
| SARS coronavirus HSZ-Bb, complete genome | 15186 | 22276 | 87% | 0.0 | 82.31 | AY394985.1 |
| SARS coronavirus CUHK-W1, complete genome | 15186 | 22577 | 88% | 0.0 | 82.32 | AY278554.2 |
| SARS coronavirus ZJ02, complete genome | 15180 | 22548 | 88% | 0.0 | 82.31 | EU371559.1 |
| SARS coronavirus Sin845, complete genome | 15180 | 22507 | 88% | 0.0 | 82.31 | AY559093.1 |
| SARS coronavirus HSZ-Bc, complete genome | 15180 | 22566 | 88% | 0.0 | 82.31 | AY394994.1 |
| SARS coronavirus HSZ-Cb, complete genome | 15180 | 22526 | 88% | 0.0 | 82.31 | AY394986.1 |
| Coronavirus BtRs-BetaCoV/YN2018B, complete genome | 15176 | 22618 | 91% | 0.0 | 82.32 | MK211376.1 |
| Bat SARS-like coronavirus isolate Rs4231, complete genome | 15176 | 22534 | 91% | 0.0 | 82.30 | KY417146.1 |
| SARS coronavirus isolate Tor2/FP1-10851, complete genome | 15175 | 22417 | 88% | 0.0 | 82.30 | JX163927.1 |
| SARS coronavirus isolate Tor2/FP1-10912, complete genome | 15175 | 22424 | 88% | 0.0 | 82.30 | JX163926.1 |
| SARS coronavirus isolate Tor2/FP1-10912, complete genome | 15175 | 22417 | 88% | 0.0 | 82.30 | JX163923.1 |
| SARS coronavirus HKU-39849 isolate UOB, complete genome | 15175 | 22529 | 88% | 0.0 | 82.30 | JQ316196.1 |
| SARS coronavirus P2, complete genome | 15175 | 22450 | 88% | 0.0 | 82.30 | FJ882963.1 |
| SARS coronavirus strain CV7, complete genome | 15175 | 22574 | 88% | 0.0 | 82.30 | DQ898174.1 |
| SARS coronavirus BJ202, complete genome | 15175 | 22579 | 88% | 0.0 | 82.30 | AY864806.1 |
| SARS Coronavirus CDC#200301157, complete genome | 15175 | 22529 | 88% | 0.0 | 82.30 | AY714217.1 |
| SARS coronavirus Sin850, complete genome | 15175 | 22516 | 88% | 0.0 | 82.30 | AY559096.1 |
| SARS coronavirus Sin847, complete genome | 15175 | 22504 | 88% | 0.0 | 82.30 | AY559095.1 |
| SARS coronavirus Sin849, complete genome | 15175 | 22502 | 88% | 0.0 | 82.30 | AY559086.1 |
| SARS coronavirus Sin848, complete genome | 15175 | 22504 | 88% | 0.0 | 82.30 | AY559085.1 |
| Severe acute respiratory syndrome-related coronavirus isolate... | 15175 | 22568 | 88% | 0.0 | 82.30 | AY274119.3 |
| SARS coronavirus HSR 1, complete genome | 15175 | 22579 | 88% | 0.0 | 82.30 | AY323977.2 |
| SARS coronavirus TW1, complete genome | 15175 | 22539 | 88% | 0.0 | 82.30 | AY291451.1 |
| SARS coronavirus TW5, complete genome | 15175 | 22533 | 88% | 0.0 | 82.30 | AY502928.1 |
| SARS coronavirus TW3, complete genome | 15175 | 22526 | 88% | 0.0 | 82.30 | AY502926.1 |
| SARS coronavirus TW10, complete genome | 15175 | 22539 | 88% | 0.0 | 82.30 | AY502923.1 |
| SARS coronavirus LC2, complete genome | 15175 | 22511 | 88% | 0.0 | 82.30 | AY394999.1 |
| SARS coronavirus LC1, complete genome | 15175 | 22561 | 88% | 0.0 | 82.30 | AY394998.1 |
| SARS coronavirus HSZ-Cc, complete genome | 15175 | 22555 | 88% | 0.0 | 82.30 | AY394995.1 |
| SARS coronavirus HGZ8L2, complete genome | 15175 | 22561 | 88% | 0.0 | 82.31 | AY394993.1 |
| SARS coronavirus HZS2-C, complete genome | 15175 | 22566 | 88% | 0.0 | 82.31 | AY394992.1 |
| SARS coronavirus HZS2-Fc, complete genome | 15175 | 22566 | 88% | 0.0 | 82.30 | AY394991.1 |
| SARS coronavirus HZS2-Fb, complete genome | 15175 | 22548 | 88% | 0.0 | 82.30 | AY394987.1 |
| SARS coronavirus HSZ2-A, complete genome | 15175 | 22535 | 88% | 0.0 | 82.30 | AY394983.1 |
| SARS coronavirus GZ-B, complete genome | 15175 | 22461 | 88% | 0.0 | 82.30 | AY394978.1 |
| SARS coronavirus PUMC02, complete genome | 15175 | 22563 | 88% | 0.0 | 82.30 | AY357075.1 |
| SARS coronavirus CUHK-Su10, complete genome | 15175 | 22561 | 88% | 0.0 | 82.30 | AY282752.2 |
| SARS coronavirus AS, complete genome | 15175 | 22522 | 88% | 0.0 | 82.30 | AY427439.1 |
| SARS coronavirus Sin2679, complete genome | 15175 | 22528 | 88% | 0.0 | 82.30 | AY283796.1 |
| SARS coronavirus TWY genomic RNA, complete genome | 15175 | 22535 | 88% | 0.0 | 82.30 | AP006561.1 |
| SARS coronavirus isolate Tor2/FP1-10895, complete genome | 15173 | 22415 | 88% | 0.0 | 82.30 | JX163925.1 |
| SARS coronavirus isolate Tor2/FP1-10895, complete genome | 15171 | 22424 | 88% | 0.0 | 82.30 | JX163928.1 |
| SARS coronavirus isolate Tor2/FP1-10851, complete genome | 15169 | 22420 | 88% | 0.0 | 82.30 | JX163924.1 |
| SARS coronavirus BJ182-8, complete genome | 15169 | 22504 | 88% | 0.0 | 82.30 | EU371563.1 |
| SARS coronavirus BJ182b, complete genome | 15169 | 22509 | 88% | 0.0 | 82.30 | EU371561.1 |
| SARS coronavirus BJ182a, complete genome | 15169 | 22509 | 88% | 0.0 | 82.30 | EU371560.1 |
| SARS coronavirus Urbani, complete genome | 15169 | 22535 | 88% | 0.0 | 82.30 | AY278741.1 |
| SARS coronavirus Sin3725V, complete genome | 15169 | 22518 | 88% | 0.0 | 82.29 | AY559087.1 |
| SARS coronavirus HZS2-D, complete genome | 15169 | 22561 | 88% | 0.0 | 82.30 | AY394989.1 |
| SARS coronavirus Sino3-11, complete genome | 15169 | 22546 | 88% | 0.0 | 82.30 | AY485278.1 |
| SARS coronavirus TW4, complete genome | 15167 | 22526 | 88% | 0.0 | 82.29 | AY502927.1 |
| BtRs-BetaCoV/YN2013, complete genome | 15163 | 21673 | 85% | 0.0 | 82.29 | KJ473816.1 |
| SARS coronavirus BJ182-4, complete genome | 15158 | 22498 | 88% | 0.0 | 82.29 | EU371562.1 |
| Coronavirus BtRs-BetaCoV/YN2018C, complete genome | 15149 | 22403 | 88% | 0.0 | 82.29 | MK211377.1 |
| Bat SARS-like coronavirus isolate Rf4092, complete genome | 15134 | 22372 | 86% | 0.0 | 82.26 | KY417145.1 |
| Coronavirus BtRs-BetaCoV/YN2018A, complete genome | 15117 | 22368 | 88% | 0.0 | 82.26 | MK211375.1 |
| SARS-related bat coronavirus isolate Anlong-111 orf1ab... | 15084 | 16332 | 64% | 0.0 | 82.35 | KF294455.1 |
| Bat coronavirus Cp/Yunnan2011, complete genome | 15043 | 21834 | 86% | 0.0 | 82.18 | JX993988.1 |
| BtRs-BetaCoV/HuB2013, complete genome | 14970 | 22339 | 87% | 0.0 | 82.20 | KJ473814.1 |
| Bat coronavirus (BtCoV/279/2005), complete genome | 14916 | 22163 | 87% | 0.0 | 82.16 | DQ648857.1 |
| Bat coronavirus Rp/Shaanxi2011, complete genome | 14892 | 21819 | 88% | 0.0 | 82.00 | JX993987.1 |
| SARS-related bat coronavirus isolate Longquan-140 orf1ab... | 14759 | 23767 | 91% | 0.0 | 81.89 | KF294457.1 |
| Coronavirus BtRl-BetaCoV/SC2018, complete genome | 14731 | 21836 | 87% | 0.0 | 82.02 | MK211374.1 |
| BtRf-BetaCoV/SX2013, complete genome | 14722 | 21360 | 87% | 0.0 | 81.82 | KJ473813.1 |
| BtRf-BetaCoV/HeB2013, complete genome | 14683 | 21261 | 86% | 0.0 | 81.79 | KJ473812.1 |
| BtRf-BetaCoV/JL2012, complete genome | 14683 | 21067 | 86% | 0.0 | 81.79 | KJ473811.1 |
| Bat SARS coronavirus HKU3-7, complete genome | 14683 | 21958 | 88% | 0.0 | 81.82 | GQ153542.1 |
| Bat SARS coronavirus HKU3-8, complete genome | 14678 | 21996 | 88% | 0.0 | 81.82 | GQ153543.1 |
| Bat coronavirus isolate Jiyuan-84, complete genome | 14628 | 21416 | 87% | 0.0 | 81.74 | KY770860.1 |
| Bat coronavirus (BtCoV/273/2005), complete genome | 14556 | 21386 | 87% | 0.0 | 81.66 | DQ648856.1 |
| Bat SARS coronavirus Rf1, complete genome | 14556 | 21394 | 87% | 0.0 | 81.66 | DQ412042.1 |
| Bat SARS coronavirus HKU3-12, complete genome | 14550 | 21830 | 88% | 0.0 | 81.68 | GQ153547.1 |
| SARS-related bat coronavirus isolate Jiyuan-331 orf1ab... | 14539 | 15829 | 66% | 0.0 | 81.66 | KF294456.1 |
| Recombinant coronavirus clone Bat SARS-CoV, complete sequence | 14517 | 21878 | 89% | 0.0 | 81.65 | FJ211859.1 |
| bat SARS coronavirus HKU3-2, complete genome | 14512 | 21810 | 88% | 0.0 | 81.64 | DQ084199.1 |
| Bat SARS coronavirus HKU3-5, complete genome | 14506 | 21791 | 88% | 0.0 | 81.63 | GQ153540.1 |
| Bat SARS coronavirus HKU3-4, complete genome | 14506 | 21797 | 88% | 0.0 | 81.63 | GQ153539.1 |
| Bat SARS coronavirus HKU3-11, complete genome | 14501 | 21780 | 88% | 0.0 | 81.63 | GQ153546.1 |
| Bat SARS coronavirus HKU3-1, complete genome | 14501 | 21863 | 89% | 0.0 | 81.63 | DQ022305.2 |
| bat SARS coronavirus HKU3-3, complete genome | 14501 | 21849 | 89% | 0.0 | 81.63 | DQ084200.1 |
| Bat SARS coronavirus HKU3-13, complete genome | 14495 | 21769 | 88% | 0.0 | 81.63 | GQ153548.1 |
| Bat SARS coronavirus HKU3-6, complete genome | 14495 | 21797 | 88% | 0.0 | 81.62 | GQ153541.1 |
| Bat SARS coronavirus HKU3-10, complete genome | 14484 | 21758 | 88% | 0.0 | 81.61 | GQ153545.1 |
| Bat SARS coronavirus HKU3-9, complete genome | 14484 | 21758 | 88% | 0.0 | 81.61 | GQ153544.1 |
| Bat coronavirus isolate JTMC15, complete genome | 13501 | 20280 | 79% | 0.0 | 82.94 | KU182964.1 |
| Bat coronavirus strain 16BO133, complete genome | 13452 | 20206 | 79% | 0.0 | 82.88 | KY938558.1 |
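The top hits are other sequencing results for the novel coronavirus. Below them, the words "SARS" and "Bat" start to appear in the descriptions.
There was also a news article with the headline "New pneumonia virus is close to bat SARS, and patient characteristics are also similar (University of Hong Kong)",
so this result makes sense.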
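The grouping described above can also be checked with a short script rather than by eye. The following is a minimal Python sketch that parses a handful of rows copied from the BLAST hit list and tallies them by description. The names `ROWS`, `parse_hits`, and `classify`, and the keyword rules used for grouping, are my own assumptions for illustration and not part of BLAST.

```python
import re
from collections import Counter

# A few rows copied from the BLAST hit list: description, max score,
# total score, query cover, E value, percent identity, accession.
ROWS = """\
Wuhan seafood market pneumonia virus isolate Wuhan-Hu-1,... 55221 55221 100% 0.0 100.00 MN908947.3
Wuhan seafood market pneumonia virus isolate... 55171 55171 99% 0.0 99.98 MN975262.1
Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome 26943 35336 95% 0.0 89.12 MG772933.1
SARS coronavirus ZS-C, complete genome 15213 22564 88% 0.0 82.34 AY395003.1
Bat coronavirus strain 16BO133, complete genome 13452 20206 79% 0.0 82.88 KY938558.1
"""

# The description is free text, so match it lazily and anchor on the
# six trailing columns.
ROW_RE = re.compile(
    r"^(?P<desc>.+?)\s+(?P<max>\d+)\s+(?P<total>\d+)\s+(?P<cover>\d+)%\s+"
    r"(?P<evalue>\S+)\s+(?P<ident>[\d.]+)\s+(?P<acc>\S+)$"
)


def parse_hits(text):
    """Parse hit-list rows into dictionaries; skip lines that do not match."""
    hits = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            hits.append({
                "desc": m["desc"],
                "max_score": int(m["max"]),
                "total_score": int(m["total"]),
                "cover": int(m["cover"]),
                "ident": float(m["ident"]),
                "acc": m["acc"],
            })
    return hits


def classify(desc):
    """Crude keyword grouping (an assumption, not a taxonomy)."""
    d = desc.lower()
    if d.startswith("wuhan seafood market"):
        return "novel coronavirus"
    if re.search(r"\bbat\b", d):
        return "bat coronavirus"
    if "sars" in d:
        return "SARS coronavirus"
    return "other"


hits = parse_hits(ROWS)
counts = Counter(classify(h["desc"]) for h in hits)
for group, n in counts.items():
    print(group, n)
```

Running the same functions over the full hit list would show where the percent identity drops from about 100% to the 80s as the hits move from the novel coronavirus records to the bat and SARS records.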
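While I was at it, I did a quick comparison of the genomes of MN908947.3 (Wuhan seafood market pneumonia virus) and MG772933.1 (Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome).
I drew a dot plot using a program called gepard.
A dot plot assigns the two genome sequences to the horizontal and vertical axes and places a dot wherever the base sequences match. If the two genome sequences are identical, a clean diagonal line appears. In this result, the two genomes match along almost their entire length. Toward the lower right, however, the line appears to be broken in a few places. These are the regions where the two genome sequences do not match.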
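To make the idea concrete, here is a minimal Python sketch of a dot plot on two short toy sequences. It is a naive word-matching approach written for illustration, not gepard's implementation, and the function names `dot_matrix` and `render`, the word length of 3, and the toy sequences are all assumptions of mine.

```python
def dot_matrix(seq_x, seq_y, word=3):
    """Return the set of (x, y) start positions where a word of length
    `word` in seq_x is identical to a word in seq_y."""
    # Index every word of seq_x by its start positions.
    index = {}
    for x in range(len(seq_x) - word + 1):
        index.setdefault(seq_x[x:x + word], []).append(x)
    # Look up every word of seq_y and record the matching coordinates.
    dots = set()
    for y in range(len(seq_y) - word + 1):
        for x in index.get(seq_y[y:y + word], ()):
            dots.add((x, y))
    return dots


def render(dots, width, height):
    """Draw the dots as text: seq_x runs left to right, seq_y top to bottom."""
    return "\n".join(
        "".join("*" if (x, y) in dots else "." for x in range(width))
        for y in range(height)
    )


# Toy sequences: seq_b equals seq_a except for three bases in the middle.
seq_a = "ACGTTGCAAGGCTA"
seq_b = "ACGTTGACCGGCTA"
word = 3

dots = dot_matrix(seq_a, seq_b, word)
n = len(seq_a) - word + 1
print(render(dots, n, n))

# Positions on the main diagonal where the two sequences agree.
diagonal = sorted(x for (x, y) in dots if x == y)
print(diagonal)
```

The diagonal is continuous where the sequences agree and has a gap around the substituted bases, which is the same kind of break seen in the gepard plot. For genome-length sequences a dedicated tool is more practical, since this sketch keeps every match in memory.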
When I looked into what lies in the mismatched region, it appears to be QHD43416.1 (surface glycoprotein [Wuhan seafood market pneumonia virus]).
Could it be that a change in the surface glycoprotein of MG772933.1 (Bat SARS-like coronavirus isolate bat-SL-CoVZC45, complete genome) is what allowed the virus to infect humans as well?