<?xml version="1.0" encoding="UTF-8" ?>

<bugzilla version="5.2"
          urlbase="https://bugzilla.altlinux.org/"
          
          maintainer="jenya@basealt.ru"
>

    <bug>
          <bug_id>31460</bug_id>
          
          <creation_ts>2015-11-09 13:26:28 +0300</creation_ts>
          <short_desc>wp2git: Import Wikipedia page history to git</short_desc>
          <delta_ts>2015-11-10 01:58:43 +0300</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>4</classification_id>
          <classification>Development</classification>
          <product>New/proposed packages</product>
          <component>Regular repository</component>
          <version>unspecified</version>
          <rep_platform>all</rep_platform>
          <op_sys>Linux</op_sys>
          <bug_status>NEW</bug_status>
          <resolution></resolution>
          
          
          <bug_file_loc>http://blog.thecybershadow.net/2010/06/16/import-wikipedia-page-history-to-git/</bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P3</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>---</target_milestone>
          
          <blocked>31414</blocked>
          <everconfirmed>1</everconfirmed>
          <reporter name="Ivan Zakharyaschev">imz</reporter>
          <assigned_to name="Andrey Cherepanov">cas</assigned_to>
          <cc>viy</cc>
          
          <qa_contact name="Andrey Cherepanov">cas</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>153564</commentid>
    <comment_count>0</comment_count>
    <who name="Ivan Zakharyaschev">imz</who>
    <bug_when>2015-11-09 13:26:28 +0300</bug_when>
    <thetext>wp2git: Import Wikipedia page history to git

https://github.com/CyberShadow/wp2git is the original in D.

https://github.com/dlenski/wp2git is a fork in Python (based on mwclient, which is already in Sisyphus).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>153566</commentid>
    <comment_count>1</comment_count>
    <who name="Ivan Zakharyaschev">imz</who>
    <bug_when>2015-11-09 13:32:22 +0300</bug_when>
    <thetext>* It could also be used to import ALT&apos;s documentation, which is edited at http://altlinux.org, into Git repos.

* As for me, I&apos;m going to use it to import the text of the GOST standard that the LaTeX package in https://bugzilla.altlinux.org/show_bug.cgi?id=31414 implements. The text is collaboratively maintained on Wikisource (https://ru.wikisource.org/wiki/%D0%93%D0%9E%D0%A1%D0%A2_7.32%E2%80%942001 ).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>153589</commentid>
    <comment_count>2</comment_count>
    <who name="Ivan Zakharyaschev">imz</who>
    <bug_when>2015-11-09 22:14:26 +0300</bug_when>
    <thetext>BTW, when I try to use it, I run into some problems.

I can&apos;t post an issue to the project on GitHub, probably because it is a fork. Yet the fork is where the issue belongs, because the problem looks Python-related.

Here are the errors I get. The last run, with English Wikipedia, is
successful; my default is probably Russian because of the locale.

For now, I have no idea whether the fix belongs in this program
or in my environment.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>153590</commentid>
    <comment_count>3</comment_count>
    <who name="Ivan Zakharyaschev">imz</who>
    <bug_when>2015-11-09 22:15:26 +0300</bug_when>
    <thetext>$ wp2git.py --help
usage: wp2git.py [-h] [-n] [-o OUT] [--lang LANG | --site SITE] article_name

Create a git repository with the history of the specified Wikipedia article.

positional arguments:
  article_name

optional arguments:
  -h, --help         show this help message and exit
  -n, --no-import    Don&apos;t invoke git fast-import; only generate fast-import data stream
  -o OUT, --out OUT  Output directory or fast-import stream file
  --lang LANG        Wikipedia language code (default ru)
  --site SITE        Alternate site (e.g. http://commons.wikimedia.org[/w/])
$ wp2git.py --site https://ru.wikisource.org &apos;ГОСТ 7.32—2001&apos;
Connected to https://ru.wikisource.org/w/
Traceback (most recent call last):
  File &quot;/home/imz/bin/wp2git.py&quot;, line 110, in &lt;module&gt;
    main()
  File &quot;/home/imz/bin/wp2git.py&quot;, line 63, in main
    page = site.pages[args.article_name]
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 156, in __getitem__
    return self.get(name, None)
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 166, in get
    namespace = self.guess_namespace(name)
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 178, in guess_namespace
    if name.startswith(u&apos;%s:&apos; % self.site.namespaces[ns].replace(&apos; &apos;, &apos;_&apos;)):
UnicodeDecodeError: &apos;ascii&apos; codec can&apos;t decode byte 0xd0 in position 0: ordinal not in range(128)
$ locale
LANG=ru_RU.utf8
LC_CTYPE=&quot;ru_RU.utf8&quot;
LC_NUMERIC=&quot;ru_RU.utf8&quot;
LC_TIME=&quot;ru_RU.utf8&quot;
LC_COLLATE=&quot;ru_RU.utf8&quot;
LC_MONETARY=&quot;ru_RU.utf8&quot;
LC_MESSAGES=POSIX
LC_PAPER=&quot;ru_RU.utf8&quot;
LC_NAME=&quot;ru_RU.utf8&quot;
LC_ADDRESS=&quot;ru_RU.utf8&quot;
LC_TELEPHONE=&quot;ru_RU.utf8&quot;
LC_MEASUREMENT=&quot;ru_RU.utf8&quot;
LC_IDENTIFICATION=&quot;ru_RU.utf8&quot;
LC_ALL=
$ wp2git.py --site http://ru.wikisource.org &apos;ГОСТ 7.32—2001&apos;
Connected to http://ru.wikisource.org/w/
Traceback (most recent call last):
  File &quot;/home/imz/bin/wp2git.py&quot;, line 110, in &lt;module&gt;
    main()
  File &quot;/home/imz/bin/wp2git.py&quot;, line 63, in main
    page = site.pages[args.article_name]
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 156, in __getitem__
    return self.get(name, None)
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 166, in get
    namespace = self.guess_namespace(name)
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 178, in guess_namespace
    if name.startswith(u&apos;%s:&apos; % self.site.namespaces[ns].replace(&apos; &apos;, &apos;_&apos;)):
UnicodeDecodeError: &apos;ascii&apos; codec can&apos;t decode byte 0xd0 in position 0: ordinal not in range(128)
$ wp2git.py Bear
Connected to http://ru.wikipedia.org/w/
Traceback (most recent call last):
  File &quot;/home/imz/bin/wp2git.py&quot;, line 110, in &lt;module&gt;
    main()
  File &quot;/home/imz/bin/wp2git.py&quot;, line 65, in main
    p.error(&apos;Page %s does not exist&apos; % s)
NameError: global name &apos;s&apos; is not defined
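Note on the NameError above: the traceback shows line 65 of wp2git.py formatting its message with the undefined name `s`. Presumably the article name was intended. Below is a minimal sketch of a corrected message, assuming the parser `p` is built with argparse (the `p.error` call and the usage text suggest this). `missing_page_message` is a hypothetical helper, not part of wp2git.

```python
import argparse


def missing_page_message(article_name):
    # Hypothetical helper: builds the message from a name that is
    # actually defined, instead of the undefined `s`.
    return "Page %s does not exist" % article_name


# Assumed reconstruction of the relevant part of the parser.
p = argparse.ArgumentParser(prog="wp2git.py")
p.add_argument("article_name")

args = p.parse_args(["Bear"])
message = missing_page_message(args.article_name)
# In the script this string would be passed to p.error(message),
# which prints the usage text and exits with status 2.
print(message)
```

With this change the run above would report the missing page instead of crashing.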
$ wp2git.py Медведь
Connected to http://ru.wikipedia.org/w/
Traceback (most recent call last):
  File &quot;/home/imz/bin/wp2git.py&quot;, line 110, in &lt;module&gt;
    main()
  File &quot;/home/imz/bin/wp2git.py&quot;, line 63, in main
    page = site.pages[args.article_name]
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 156, in __getitem__
    return self.get(name, None)
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 166, in get
    namespace = self.guess_namespace(name)
  File &quot;/usr/lib64/python2.7/site-packages/mwclient/listing.py&quot;, line 178, in guess_namespace
    if name.startswith(u&apos;%s:&apos; % self.site.namespaces[ns].replace(&apos; &apos;, &apos;_&apos;)):
UnicodeDecodeError: &apos;ascii&apos; codec can&apos;t decode byte 0xd0 in position 0: ordinal not in range(128)
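Note on the UnicodeDecodeError above: under Python 2.7 command-line arguments arrive as byte strings, so a likely cause is that the non-ASCII article name is implicitly decoded as ASCII when mwclient compares it with a unicode namespace prefix. Byte 0xd0 is the first UTF-8 byte of both "Г" and "М", which matches the message. The sketch below is written for Python 3 and makes the ASCII decode explicit to reproduce the failure. `decode_arg` is a hypothetical helper, not part of wp2git or mwclient.

```python
import locale


def decode_arg(raw, encoding=None):
    # Hypothetical helper: turn a raw command-line byte string into text,
    # using the given encoding or the locale's preferred one.
    if isinstance(raw, bytes):
        return raw.decode(encoding or locale.getpreferredencoding(False))
    return raw


# The bytes a Python 2 script would see in sys.argv under ru_RU.utf8.
raw_name = "Медведь".encode("utf-8")

# Reproduce the failure: decode those bytes as ASCII, which Python 2
# does implicitly when a byte string meets a unicode string.
try:
    raw_name.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

# Decoding explicitly as UTF-8 yields the intended title.
title = decode_arg(raw_name, "utf-8")
print(error_message)
```

In wp2git.py this would presumably mean decoding args.article_name before the site.pages lookup. Whether that alone is sufficient has not been checked against mwclient here.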
$ wp2git.py --lang en Bear
Connected to http://en.wikipedia.org/w/
Initialized empty Git repository in /home/imz/tests/test-wp2git/Bear/
 &gt;&gt; Revision 239584 by TimShell at Wed Oct 10 21:50:27 2001: *
 &gt;&gt; Revision 346214979 by Alan Millar at Wed Oct 10 22:43:35 2001: Fixing panda back to giant panda
 &gt;&gt; Revision 50758 by Conversion script at Mon Feb 25 15:43:11 2002: Automated conversion
 &gt;&gt; Revision 87603 by Mirwin at Thu Apr 11 20:33:54 2002: Added grizzly bear to list
 &gt;&gt; Revision 88030 by 24.53.240.203 at Fri Jun  7 12:00:50 2002: *
 &gt;&gt; Revision 112194 by Stephen Gilbert at Fri Jun  7 17:39:15 2002: removing dictionary.com link
 &gt;&gt; Revision 132079 by PierreAbbat at Sun Jul  7 08:40:44 2002: restore accidentally deleted end of sentence
 &gt;&gt; Revision 192849 by Andre Engels at Wed Jul 31 07:41:08 2002: de-orphanizing an image
 &gt;&gt; Revision 227848 by Montrealais at Tue Sep  3 11:17:32 2002: 
 &gt;&gt; Revision 227861 by 203.48.160.12 at Wed Sep 18 23:44:43 2002: 
 &gt;&gt; Revision 398194 by Mav at Wed Sep 18 23:56:46 2002: REVERT from VANDALISM by 203.48.160.12
 &gt;&gt; Revision 398212 by Fred Bauder at Fri Nov  1 17:01:16 2002: further reading
 &gt;&gt; Revision 590676 by Stormwriter at Fri Nov  1 17:07:50 2002: 
 &gt;&gt; Revision 590687 by Karen Johnson at Thu Jan 16 11:07:36 2003: I&apos;m not sure which type of bear this is, but uploading a pic I took
 &gt;&gt; Revision 590917 by MartinHarper at Thu Jan 16 11:24:04 2003: link [[bear market]]
 &gt;&gt; Revision 626017 by Robert Merkel at Thu Jan 16 13:28:02 2003: link to koala (mention it&apos;s *not* a bear
 &gt;&gt; Revision 626559 by Sannse at Tue Jan 28 12:39:35 2003: [[American]] -&gt; [[United States|American]]
 &gt;&gt; Revision 629029 by 207.213.160.63 at Tue Jan 28 18:52:45 2003: 
 &gt;&gt; Revision 629038 by Bronco~enwiki at Wed Jan 29 18:50:37 2003: Our fifth graders have finished for the time being.
 &gt;&gt; Revision 629079 by Bronco~enwiki at Wed Jan 29 18:53:21 2003: Done?
 &gt;&gt; Revision 659547 by Fred Bauder at Wed Jan 29 19:14:46 2003: removed information about authors of the article
 &gt;&gt; Revision 660956 by Alan Peakall at Tue Feb 11 12:53:34 2003: Copy edit and rationalised links to the Panda articles
 &gt;&gt; Revision 735916 by Ahoerstemeier at Tue Feb 11 22:09:12 2003: cave bear
 &gt;&gt; Revision 748674 by Montrealais at Mon Mar 10 01:37:20 2003: 
 &gt;&gt; Revision 769708 by Kricxjo at Sat Mar 15 09:44:21 2003: eo:
 &gt;&gt; Revision 816500 by Fred Bauder at Sun Mar 23 12:31:46 2003: re use
 &gt;&gt; Revision 930991 by ArnoLagrange at Thu Apr 10 08:00:09 2003: de
 &gt;&gt; Revision 931028 by Tannin at Sat May 17 19:38:29 2003: 
 &gt;&gt; Revision 988458 by Tannin at Sat May 17 19:48:11 2003: 
 &gt;&gt; Revision 988462 by Eclecticology at Mon Jun  2 04:11:48 2003: fixing capitalization
 &gt;&gt; Revision 988465 by Eclecticology at Mon Jun  2 04:12:28 2003: 
 &gt;&gt; Revision 988504 by Tannin at Mon Jun  2 04:12:56 2003: revert to correct case
 &gt;&gt; Revision 988507 by Eclecticology at Mon Jun  2 04:25:27 2003: revert to correct capitalization
 &gt;&gt; Revision 988932 by Tannin at Mon Jun  2 04:26:22 2003: revert
 &gt;&gt; Revision 988936 by Eclecticology at Mon Jun  2 08:13:45 2003: revert
 &gt;&gt; Revision 1015549 by Tannin at Mon Jun  2 08:14:53 2003: revert to correct version
 &gt;&gt; Revision 1122344 by &amp;#178;&amp;#185;&amp;#178; at Mon Jun  9 12:42:36 2003: 
 &gt;&gt; Revision 1122374 by TeunSpaans at Mon Jul  7 12:34:28 2003: +nl
 &gt;&gt; Revision 1122394 by Andre Engels at Mon Jul  7 12:46:16 2003: merged Ursidae in here
 &gt;&gt; Revision 1122415 by Andre Engels at Mon Jul  7 12:55:18 2003: 
 &gt;&gt; Revision 1122422 by Jimfbleak at Mon Jul  7 13:11:09 2003: treid to make text more grown-up
 &gt;&gt; Revision 1152613 by Rmhermen at Mon Jul  7 13:16:15 2003: typos
 &gt;&gt; Revision 1152615 by Andre Engels at Tue Jul 15 17:32:46 2003: made images wrap-around
 &gt;&gt; Revision 1160907 by Andre Engels at Tue Jul 15 17:33:18 2003: 
 &gt;&gt; Revision 1320504 by Baldhur at Thu Jul 17 17:16:36 2003: + taxobox, standardising classification
 &gt;&gt; Revision 1320681 by 81.203.98.109 at Wed Aug 20 20:46:26 2003: 
 &gt;&gt; Revision 1406742 by Rmhermen at Wed Aug 20 21:24:43 2003: 
 &gt;&gt; Revision 1406748 by 62.64.204.83 at Sun Sep  7 20:43:36 2003: 
 &gt;&gt; Revision 1411109 by 62.64.204.83 at Sun Sep  7 20:44:49 2003: 
 &gt;&gt; Revision 1411169 by Rmhermen at Mon Sep  8 18:51:45 2003: 
$</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>153595</commentid>
    <comment_count>4</comment_count>
    <who name="Ivan Zakharyaschev">imz</who>
    <bug_when>2015-11-10 01:48:17 +0300</bug_when>
    <thetext>That error happens with python-module-mwclient-0.6.5-alt1.1 from t7. Sisyphus has a newer version. I shall try that one.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>153596</commentid>
    <comment_count>5</comment_count>
    <who name="Ivan Zakharyaschev">imz</who>
    <bug_when>2015-11-10 01:58:43 +0300</bug_when>
    <thetext>No, the same error happens with python-module-mwclient-0.7-alt1.dev.git20140622</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>