Python Development Environment Troubleshooting Notes


pip install xxx fails

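Virtual environments are managed with Anaconda3. Installing a package into a specific virtual environment failed, and pip suggested running pip install --upgrade pip, but the upgrade itself failed with: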
Cannot open D:\Program Files\Anaconda3\Scripts\pip-script.py

The following command can be used instead to upgrade pip to a specific version: easy_install -U pip==21.0.1
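
A failure to open the pip-script.py launcher can often be sidestepped by running pip as a module through the environment's own interpreter, since that route does not go through the launcher script. The sketch below only builds that command line. The helper name `pip_upgrade_command` is made up for illustration, and the pinned version is the one from the note above.

```python
import sys


def pip_upgrade_command(version=None, python=sys.executable):
    """Build a `python -m pip install --upgrade pip[==version]` command.

    `pip_upgrade_command` is a hypothetical helper, not part of pip.
    `python` defaults to the interpreter running this script, so the
    command targets whichever virtual environment is currently active.
    """
    target = "pip" if version is None else f"pip=={version}"
    return [python, "-m", "pip", "install", "--upgrade", target]


# Show the command instead of running it; nothing is installed here.
print(" ".join(pip_upgrade_command("21.0.1")))
```

Passing the returned list to `subprocess.run(..., check=True)` would actually perform the upgrade.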

tesserocr installation fails

Collecting tesserocr==2.4.0
  Using cached tesserocr-2.4.0.tar.gz (54 kB)

DEPRECATION: The -b/--build/--build-dir/--build-directory option is deprecated and has no effect anymore. pip 21.1 will remove support for this functionality. A possible replacement is use the TMPDIR/TEMP/TMP environment variable, possibly combined with --no-clean. You can find discussion regarding this at https://github.com/pypa/pip/issues/8333.
    ERROR: Command errored out with exit status 1:
     command: 'D:\tools\Anaconda3\envs\spider_secondhand_vehicle\Scripts\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-mja592cn\\tesserocr_eeb004ee535f424cad8c0240132c285f\\setup.py'"'"'; __file__='"'"'C:\\Users\\user\\AppData\\Local\\Temp\\pip-install-mja592cn\\tesserocr_eeb004ee535f424cad8c0240132c285f\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\user\AppData\Local\Temp\pip-pip-egg-info-wvb43fvc'
         cwd: C:\Users\user\AppData\Local\Temp\pip-install-mja592cn\tesserocr_eeb004ee535f424cad8c0240132c285f\
    Complete output (167 lines):
    Failed to extract tesseract version from executable: [WinError 2] 系统找不到指定的文件。
    Supporting tesseract v3.04.00
    Building with configs: {'libraries': ['tesseract', 'lept'], 'cython_compile_time_env': {'TESSERACT_VERSION': 50593792}}
    Unable to find pgen, not compiling formal grammar.
    D:\tools\Anaconda3\lib\distutils\dist.py:261: UserWarning: Unknown distribution option: 'python_requires'
      warnings.warn(msg)
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Scanners.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Actions.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Machines.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Transitions.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\DFA.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\Scanning.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\Visitor.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\FlowControl.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Runtime\refnanny.pyx because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\FusedNode.py because it changed.
    Compiling C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Tempita\_tempita.py because it changed.
    [ 1/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\FlowControl.py
    [ 2/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\FusedNode.py
    [ 3/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\Scanning.py
    [ 4/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Compiler\Visitor.py
    [ 5/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Actions.py
    [ 6/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\DFA.py
    [ 7/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Machines.py
    [ 8/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Scanners.py
    [ 9/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Plex\Transitions.py
    [10/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Runtime\refnanny.pyx
    [11/11] Cythonizing C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\Cython\Tempita\_tempita.py
    warning: no files found matching 'Doc\*'
    warning: no files found matching '*.pyx' under directory 'Cython\Debugger\Tests'
    warning: no files found matching '*.pxd' under directory 'Cython\Debugger\Tests'
    warning: no files found matching '*.pxd' under directory 'Cython\Utility'
    warning: no files found matching 'pyximport\README'
    warning: build_py: byte-compiling is disabled, skipping.
    
    Traceback (most recent call last):
      File "D:\tools\Anaconda3\lib\distutils\core.py", line 148, in setup
        dist.run_commands()
      File "D:\tools\Anaconda3\lib\distutils\dist.py", line 955, in run_commands
        self.run_command(cmd)
      File "D:\tools\Anaconda3\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\bdist_egg.py", line 161, in run
        cmd = self.call_command('install_lib', warn_dir=0)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\bdist_egg.py", line 147, in call_command
        self.run_command(cmdname)
      File "D:\tools\Anaconda3\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "D:\tools\Anaconda3\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\install_lib.py", line 10, in run
        self.build()
      File "D:\tools\Anaconda3\lib\distutils\command\install_lib.py", line 107, in build
        self.run_command('build_ext')
      File "D:\tools\Anaconda3\lib\distutils\cmd.py", line 313, in run_command
        self.distribution.run_command(command)
      File "D:\tools\Anaconda3\lib\distutils\dist.py", line 974, in run_command
        cmd_obj.run()
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\build_ext.py", line 49, in run
        _build_ext.run(self)
      File "D:\tools\Anaconda3\lib\distutils\command\build_ext.py", line 339, in run
        self.build_extensions()
      File "D:\tools\Anaconda3\lib\distutils\command\build_ext.py", line 448, in build_extensions
        self._build_extensions_serial()
      File "D:\tools\Anaconda3\lib\distutils\command\build_ext.py", line 473, in _build_extensions_serial
        self.build_extension(ext)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\build_ext.py", line 174, in build_extension
        _build_ext.build_extension(self, ext)
      File "D:\tools\Anaconda3\lib\distutils\command\build_ext.py", line 533, in build_extension
        depends=ext.depends)
      File "D:\tools\Anaconda3\lib\distutils\_msvccompiler.py", line 345, in compile
        self.initialize()
      File "D:\tools\Anaconda3\lib\distutils\_msvccompiler.py", line 238, in initialize
        vc_env = _get_vc_env(plat_spec)
      File "D:\tools\Anaconda3\lib\distutils\_msvccompiler.py", line 134, in _get_vc_env
        raise DistutilsPlatformError("Unable to find vcvarsall.bat")
    distutils.errors.DistutilsPlatformError: Unable to find vcvarsall.bat
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 154, in save_modules
        yield saved
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 195, in setup_context
        yield
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 243, in run_setup
        DirectorySandbox(setup_dir).run(runner)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 273, in run
        return func()
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 242, in runner
        _execfile(setup_script, ns)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 46, in _execfile
        exec(code, globals, locals)
      File "C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\setup.py", line 299, in <module>
      File "C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\setup.py", line 294, in run_build
      File "D:\tools\Anaconda3\lib\distutils\core.py", line 163, in setup
        raise SystemExit("error: " + str(msg))
    SystemExit: error: Unable to find vcvarsall.bat
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\easy_install.py", line 1064, in run_setup
        run_setup(setup_script, args)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 246, in run_setup
        raise
      File "D:\tools\Anaconda3\lib\contextlib.py", line 99, in __exit__
        self.gen.throw(type, value, traceback)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 195, in setup_context
        yield
      File "D:\tools\Anaconda3\lib\contextlib.py", line 99, in __exit__
        self.gen.throw(type, value, traceback)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 166, in save_modules
        saved_exc.resume()
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 141, in resume
        six.reraise(type, exc, self._tb)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\pkg_resources\_vendor\six.py", line 685, in reraise
        raise value.with_traceback(tb)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 154, in save_modules
        yield saved
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 195, in setup_context
        yield
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 243, in run_setup
        DirectorySandbox(setup_dir).run(runner)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 273, in run
        return func()
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 242, in runner
        _execfile(setup_script, ns)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\sandbox.py", line 46, in _execfile
        exec(code, globals, locals)
      File "C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\setup.py", line 299, in <module>
      File "C:\Users\user\AppData\Local\Temp\easy_install-jl6cct0z\Cython-3.0a6\setup.py", line 294, in run_build
      File "D:\tools\Anaconda3\lib\distutils\core.py", line 163, in setup
        raise SystemExit("error: " + str(msg))
    SystemExit: error: Unable to find vcvarsall.bat
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\user\AppData\Local\Temp\pip-install-mja592cn\tesserocr_eeb004ee535f424cad8c0240132c285f\setup.py", line 210, in <module>
        setup_requires=['Cython>=0.23'],
      File "D:\tools\Anaconda3\lib\distutils\core.py", line 108, in setup
        _setup_distribution = dist = klass(attrs)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\dist.py", line 269, in __init__
        self.fetch_build_eggs(attrs['setup_requires'])
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\dist.py", line 313, in fetch_build_eggs
        replace_conflicting=True,
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\pkg_resources\__init__.py", line 826, in resolve
        dist = best[req.key] = env.best_match(req, ws, installer)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\pkg_resources\__init__.py", line 1092, in best_match
        return self.obtain(req, installer)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\pkg_resources\__init__.py", line 1104, in obtain
        return installer(requirement)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\dist.py", line 380, in fetch_build_egg
        return cmd.easy_install(req)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\easy_install.py", line 640, in easy_install
        return self.install_item(spec, dist.location, tmpdir, deps)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\easy_install.py", line 670, in install_item
        dists = self.install_eggs(spec, download, tmpdir)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\easy_install.py", line 850, in install_eggs
        return self.build_and_install(setup_script, setup_base)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\easy_install.py", line 1078, in build_and_install
        self.run_setup(setup_script, setup_base, args)
      File "D:\tools\Anaconda3\envs\spider_secondhand_vehicle\lib\site-packages\setuptools\command\easy_install.py", line 1066, in run_setup
        raise DistutilsError("Setup script exited with %s" % (v.args[0],))
    distutils.errors.DistutilsError: Setup script exited with error: Unable to find vcvarsall.bat
    ----------------------------------------
WARNING: Discarding https://files.pythonhosted.org/packages/9b/98/b4a534c4f3da4163c8c3d4dfdb1619748b7fe7d8c4fc4718cad3cda55e32/tesserocr-2.4.0.tar.gz#sha256=b0a6f44044217f962541f3166c817023cf149d208cd5cb19cc46fc1032698731 (from https://pypi.org/simple/tesserocr/). Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Could not find a version that satisfies the requirement tesserocr==2.4.0
ERROR: No matching distribution found for tesserocr==2.4.0

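The log above shows that vcvarsall.bat is missing, which means a Microsoft Visual C++ compiler has to be installed. I assumed VS2010; strictly, the _msvccompiler.py in the traceback is the distutils module used by Python 3.5 and later, which expects Visual C++ 14.x (Visual Studio 2015 or newer). Either way that is too much trouble, so I used the following approach found online:
Go to https://github.com/simonflueckiger/tesserocr-windows_build/releases , check whether your machine (more precisely, your Python interpreter) is 64-bit or 32-bit and which Python version you are running, download the matching .whl file, then run pip install (path to the .whl file) from the command line.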
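
Choosing the matching wheel comes down to two facts about the interpreter inside the virtual environment: its version and its bitness. A minimal sketch that derives both follows. The function name `windows_wheel_tags` is hypothetical, and the tag strings follow the usual wheel naming convention (`cp36`, `win_amd64`, `win32`), not anything specific to the release page.

```python
import struct
import sys


def windows_wheel_tags(version_info=sys.version_info, pointer_bits=None):
    """Return (python_tag, platform_tag) for a CPython build on Windows.

    Hypothetical helper: it only formats the tags that normally appear
    in wheel file names; it does not query the release page.
    """
    if pointer_bits is None:
        # Size of a C pointer in the running interpreter, in bits.
        pointer_bits = struct.calcsize("P") * 8
    python_tag = f"cp{version_info[0]}{version_info[1]}"
    platform_tag = "win_amd64" if pointer_bits == 64 else "win32"
    return python_tag, platform_tag


print(windows_wheel_tags())
```

Run it with the environment's own python.exe and pick a wheel whose file name contains both tags. A 32-bit Python on a 64-bit Windows still needs the win32 wheel.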

tesserocr.image_to_text fails

Although the installation succeeded, extracting text from an image raised the following error:

RuntimeError: Failed to init API, possibly an invalid tessdata path:D:\tools\Anaconda3\envs\xxx\Scripts\/tessdata/

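Manually creating the tessdata directory did not help. I suspected the \/ in the path, but that path is built by the package rather than by my own code, and the package itself is unlikely to be broken. The fix suggested online is to install Tesseract and copy the files under D:\xxx\Tesseract-OCR\tessdata into the directory named in the error. After doing that, images were recognized correctly. I am not sure of the exact mechanism, but presumably Tesseract needs the language data files (such as eng.traineddata) when it initializes, so an empty directory is not enough.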

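Things to note:

  • The Tesseract version must be compatible with the Python and tesserocr versions. Download: https://digi.bib.uni-mannheim.de/tesseract/
  • image_to_text does not succeed on every image; expect some failure rate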
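
Since the "Failed to init API" error went away once the tessdata files were copied over, a quick check of which language files a directory actually contains can save a round of guessing. This is a minimal sketch. `list_traineddata` is a hypothetical helper, and it assumes language data is stored as `<lang>.traineddata` files, which is how the Tesseract installer lays them out.

```python
from pathlib import Path


def list_traineddata(tessdata_dir):
    """Return the language codes that have a .traineddata file in the directory.

    Hypothetical helper: returns an empty list when the directory is
    missing or contains no language data.
    """
    directory = Path(tessdata_dir)
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.traineddata"))
```

Calling it on the directory printed in the error message should return at least the language passed to tesserocr (for example `eng`). An empty list would explain the init failure. As far as I know, tesserocr's functions also accept a `path` argument for the tessdata directory, but treat that as an assumption and check the documentation of the installed version.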

Installing tesserocr on CentOS

  • Installation guide: https://www.pianshen.com/article/9026369424/
  • Common installation problems:
    • https://blog.csdn.net/u014359108/article/details/108343787
    • https://blog.csdn.net/weixin_33878457/article/details/89836864
    • https://shipengliang.com/software-exp/error-in-pixreadmemtiff-function-not-present解决办法.html

With the environment above fully prepared, pip install tesserocr failed with the following error:

    ...
    In file included from tesserocr.cpp:848:0:
    /usr/local/include/tesseract/baseapi.h: At global scope:
    /usr/local/include/tesseract/baseapi.h:78:31: error: ‘UNICHAR_ID’ has not been declared
                                   UNICHAR_ID unichar_id, bool word_end) const;
                                   ^
    /usr/local/include/tesseract/baseapi.h: In member function ‘int tesseract::TessBaseAPI::Init(const char*, const char*, tesseract::OcrEngineMode)’:
    /usr/local/include/tesseract/baseapi.h:232:42: error: ‘nullptr’ was not declared in this scope
         return Init(datapath, language, oem, nullptr, 0, nullptr, nullptr, false);
                                              ^
    /usr/local/include/tesseract/baseapi.h: In member function ‘int tesseract::TessBaseAPI::Init(const char*, const char*)’:
    /usr/local/include/tesseract/baseapi.h:235:50: error: ‘nullptr’ was not declared in this scope
         return Init(datapath, language, OEM_DEFAULT, nullptr, 0, nullptr, nullptr, false);
                                                      ^
    /usr/local/include/tesseract/baseapi.h: In member function ‘Boxa* tesseract::TessBaseAPI::GetTextlines(Pixa**, int**)’:
    /usr/local/include/tesseract/baseapi.h:413:51: error: ‘nullptr’ was not declared in this scope
         return GetTextlines(false, 0, pixa, blockids, nullptr);
                                                       ^
    /usr/local/include/tesseract/baseapi.h: In member function ‘Boxa* tesseract::TessBaseAPI::GetComponentImages(tesseract::PageIteratorLevel, bool, Pixa**, int**)’:
    /usr/local/include/tesseract/baseapi.h:464:75: error: ‘nullptr’ was not declared in this scope
         return GetComponentImages(level, text_only, false, 0, pixa, blockids, nullptr);
                                                                               ^
    tesserocr.cpp: In function ‘int __pyx_pymod_exec_tesserocr(PyObject*)’:
    tesserocr.cpp:47999:70: error: ‘OEM_CUBE_ONLY’ is not a member of ‘tesseract’
       __pyx_t_10 = __Pyx_PyInt_From_enum__tesseract_3a__3a_OcrEngineMode(tesseract::OEM_CUBE_ONLY); if (unlikely(!__pyx_t_10)) __PYX_ERR(0, 91, __pyx_L1_error)
                                                                          ^
    tesserocr.cpp:48012:70: error: ‘OEM_TESSERACT_CUBE_COMBINED’ is not a member of ‘tesseract’
       __pyx_t_10 = __Pyx_PyInt_From_enum__tesseract_3a__3a_OcrEngineMode(tesseract::OEM_TESSERACT_CUBE_COMBINED); if (unlikely(!__pyx_t_10)) __PYX_ERR(0, 92, __pyx_L1_error)
                                                                          ^
    error: command 'gcc' failed with exit status 1
    
    ----------------------------------------
Command "/usr/local/creeper-scrapyd/scrapyd/venv/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-98qbubb1/tesserocr/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-_wi0cg6m/install-record.txt --single-version-externally-managed --compile --install-headers /usr/local/creeper-scrapyd/scrapyd/venv/include/site/python3.6/tesserocr" failed with error code 1 in /tmp/pip-install-98qbubb1/tesserocr/

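My first guess was a version mismatch between the two, but Tesseract 4.0.0 is installed together with tesserocr 2.4.0, and the same combination installed without problems on my local Windows 7 machine, so I ruled that out. That comparison is not fully like-for-like, though: the Windows install used a prebuilt wheel and never compiled against local headers. The nullptr errors suggest the compiler was not running in C++11 mode, and the OEM_CUBE_ONLY errors suggest the generated code targeted the Tesseract 3.x API, so the build may simply have failed to detect the installed 4.0.0.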
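
Before drawing conclusions about version matching, it helps to confirm which Tesseract the build machine actually exposes on its PATH. The sketch below runs `tesseract --version` and parses the first version number it finds. Both function names are hypothetical, and the regular expression assumes output of the form `tesseract 4.0.0`.

```python
import re
import subprocess


def parse_tesseract_version(text):
    """Extract (major, minor, patch) from `tesseract --version` style output."""
    match = re.search(r"tesseract\s+v?(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        return None
    # A missing patch component (e.g. "3.04") is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def installed_tesseract_version(executable="tesseract"):
    """Run the executable and parse its version; None if it cannot be run."""
    try:
        result = subprocess.run(
            [executable, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        return None
    # Some releases print the version to stderr, so inspect both streams.
    return parse_tesseract_version(result.stdout + result.stderr)
```

If `installed_tesseract_version()` returns None or a 3.x tuple for the user that runs pip, the build is probably not seeing the 4.0.0 installation under /usr/local.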

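I then tried pytesseract instead: pip install pytesseract succeeded. Both pytesseract and tesserocr are wrappers around Tesseract, so I switched to pytesseract for now. The root cause of the tesserocr build failure is still undetermined.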

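Note: switch to the root account before building. ./configure kept reporting "Leptonica 1.74 or higher is required. Try to install libleptonica-dev package", even after /etc/profile had been edited and sourced. The cause was that I had not switched to root, so source and configure were being run under different users.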
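
One way to spot this kind of mismatch is to print the build-related variables under each account and compare the results. The sketch below does that. `build_environment` is a hypothetical helper, and the variable names are only an assumption about what might have been set in /etc/profile, since the note does not say which ones were edited.

```python
import os


def build_environment(names=("PATH", "LD_LIBRARY_PATH", "PKG_CONFIG_PATH"),
                      environ=None):
    """Return the requested variables from the environment as a dict.

    Hypothetical helper; variables that are not set map to "".
    The default names are assumptions, not taken from the note.
    """
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, "") for name in names}


for name, value in build_environment().items():
    print(f"{name}={value}")
```

Running it once as the normal user and once as root shows whether the settings from /etc/profile are visible in the shell that runs ./configure.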