kuro's Notes

96's personal notes

Unsupervised machine learning

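The more books on deep learning I read, the more I suspect that what I want to do can't be done with supervised deep learning. In the end, learning to extract features doesn't work unless you have a large number of examples that can be clearly judged correct or incorrect, and the real world rarely hands you such clear-cut examples. Far more often the question is whether you can pick out features and classify items within a group you don't really understand.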

So, going against a time when everything is turning into deep learning, I'm heading back to unsupervised machine learning.

I bought a book right away and plan to work through its hands-on exercises.

With a fresh start, then, I'm setting up the environment on my Mac.

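Python 3.6 is already installed, so I install tensorflow and keras with pip.
Next I tried to install xgboost and got a little stuck.
The book installs it from a file downloaded with git, but that file is a wheel built for Windows, so it can't be installed here.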

$ cd handson-unsupervised-learning/
$ cd xgboost/
$ pip3 install xgboost-0.81-cp36-cp36m-win_amd64.whl 
ERROR: xgboost-0.81-cp36-cp36m-win_amd64.whl is not a supported wheel on this platform.
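
As a side note, the reason pip rejects this file is written in the filename itself: a wheel's name ends with Python, ABI, and platform tags. Below is a minimal sketch that splits the filename apart so the platform tag can be compared with the local machine. The helper name `wheel_tags` is my own invention for illustration, and the parsing only covers the simple case seen here.

```python
import sysconfig


def wheel_tags(filename):
    """Split a wheel filename into its name, version, and compatibility tags.

    Hypothetical helper for illustration; assumes the usual layout
    name-version[-build]-python-abi-platform.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError("unexpected wheel filename: " + filename)
    # The last three fields are always the python, abi, and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


tags = wheel_tags("xgboost-0.81-cp36-cp36m-win_amd64.whl")
print("wheel platform tag:", tags["platform"])          # win_amd64
print("this machine's platform:", sysconfig.get_platform())
```

The wheel is tagged `win_amd64`, which will not match whatever `sysconfig.get_platform()` reports on a Mac, hence the "not a supported wheel on this platform" error.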

Looking more closely, a note in the book's margin says
it can be installed with pip install xgboost, so:

$ pip3 install xgboost
Collecting xgboost
  Downloading xgboost-1.0.2.tar.gz (821 kB)
     |████████████████████████████████| 821 kB 5.0 MB/s 
    ERROR: Command errored out with exit status 1:
     command: /usr/local/opt/python/bin/python3.6 -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/setup.py'"'"'; __file__='"'"'/private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-pip-egg-info-s7yrttks
         cwd: /private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/
    Complete output (27 lines):
    ++ pwd
    + oldpath=/private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost
    + cd ./xgboost/
    + mkdir -p build
    + cd build
    + cmake ..
    ./xgboost/build-python.sh: line 21: cmake: command not found
    + echo -----------------------------
    -----------------------------
    + echo 'Building multi-thread xgboost failed'
    Building multi-thread xgboost failed
    + echo 'Start to build single-thread xgboost'
    Start to build single-thread xgboost
    + cmake .. -DUSE_OPENMP=0
    ./xgboost/build-python.sh: line 27: cmake: command not found
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/setup.py", line 42, in <module>
        LIB_PATH = libpath['find_lib_path']()
      File "/private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/xgboost/libpath.py", line 50, in find_lib_path
        'List of candidates:\n' + ('\n'.join(dll_path)))
    XGBoostLibraryNotFound: Cannot find XGBoost Library in the candidate path, did you install compilers and run build.sh in root path?
    List of candidates:
    /private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/xgboost/libxgboost.dylib
    /private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/xgboost/../../lib/libxgboost.dylib
    /private/var/folders/bh/ggq7fvb9581379cgprt5p9r40000gn/T/pip-install-4x26emtu/xgboost/xgboost/./lib/libxgboost.dylib
    /usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/xgboost/libxgboost.dylib
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Is the compiler the problem?

$ gcc -dumpversion
4.2.1

The latest gcc is apparently version 8, but I had read something about 4 and 5 not being compatible, so I figured I'd try installing 6 for now:

$ brew install gcc6
Updating Homebrew...
==> Downloading https://homebrew.bintray.com/bottles-portable-ruby/portable-ruby-2.6.3.mavericks.bottle.tar.gz
######################################################################## 100.0%
==> Pouring portable-ruby-2.6.3.mavericks.bottle.tar.gz
==> Auto-updated Homebrew!
Updated 3 taps (brewsci/science, homebrew/cask and homebrew/core).

.........

Warning: You are using macOS 10.12.
We (and Apple) do not provide support for this old version.
You will encounter build failures with some formulae.
Please create pull requests instead of asking for help on Homebrew's GitHub,
Discourse, Twitter or IRC. You are responsible for resolving any issues you
experience while you are running this old version.

..........

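Hm? Version 6 is old so it isn't supported? Should I have just installed 8 after all?
No, reading the warning again, it's the OS (macOS 10.12) that is too old.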

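After a long wait, gcc6 was finally installed.
But the error still hasn't gone away.
Maybe it's because cmake isn't installed (the log above does say "cmake: command not found").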
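
Before installing anything else, it may be worth checking which build tools are actually visible on the PATH. The following is a minimal sketch using only the standard library; the function name `find_build_tools` and the list of tool names are my own choices for illustration, not something taken from the xgboost docs.

```python
import shutil


def find_build_tools(names=("cmake", "make", "gcc")):
    """Return {tool name: full path, or None if it is not on PATH}."""
    return {name: shutil.which(name) for name in names}


for tool, path in find_build_tools().items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

If cmake shows up as not found here, the build script in the log above has no chance of working, whatever the gcc version is.
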
So:

$ brew install make
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/cask and homebrew/core).
==> Updated Formulae
re2
==> Updated Casks
keka                                     mactracker

Warning: You are using macOS 10.12.
We (and Apple) do not provide support for this old version.
You will encounter build failures with some formulae.
Please create pull requests instead of asking for help on Homebrew's GitHub,
Discourse, Twitter or IRC. You are responsible for resolving any issues you
experience while you are running this old version.

..........

This also took a long time.
But the install still fails, now with a different error:

$ pip3 install xgboost
WARNING: pip is configured with locations that require TLS/SSL, however the ssl module in Python is not available.
WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/xgboost/
WARNING: Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/xgboost/
WARNING: Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/xgboost/
WARNING: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/xgboost/
WARNING: Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)': /simple/xgboost/
Could not fetch URL https://pypi.org/simple/xgboost/: There was a problem confirming the ssl certificate: HTTPSConnectionPool(host='pypi.org', port=443): Max retries exceeded with url: /simple/xgboost/ (Caused by SSLError("Can't connect to HTTPS URL because the SSL module is not available.",)) - skipping
ERROR: Could not find a version that satisfies the requirement xgboost (from versions: none)
ERROR: No matching distribution found for xgboost
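
This time pip never even gets to the build step: the warnings say Python's ssl module is not available, so it cannot reach PyPI over HTTPS. My guess, which I have not confirmed, is that one of the Homebrew installs changed the OpenSSL library that this Python was linked against. A minimal sketch to check whether the running interpreter can load ssl at all (the function name `ssl_status` is made up for illustration):

```python
import importlib


def ssl_status():
    """Return (True, OpenSSL version) if ssl imports, else (False, error text)."""
    try:
        ssl = importlib.import_module("ssl")
    except ImportError as exc:
        return False, str(exc)
    return True, ssl.OPENSSL_VERSION


ok, detail = ssl_status()
print("ssl module available:" if ok else "ssl module NOT available:", detail)
```

If this reports that ssl is not available, the problem is with the Python installation itself rather than with xgboost.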

The next thing I tried was the official guide:
Installation Guide — xgboost 1.1.0-SNAPSHOT documentation
Following this page (which, frankly, I should have read first):

$ brew install libomp
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/core).
==> Updated Formulae
balena-cli          emscripten          gatsby-cli          neomutt
cfn-lint            faudio              git-annex           proj
conserver           fluent-bit          git-quick-stats
dxpy                fwup                gopass

Warning: You are using macOS 10.12.
We (and Apple) do not provide support for this old version.
You will encounter build failures with some formulae.
Please create pull requests instead of asking for help on Homebrew's GitHub,
Discourse, Twitter or IRC. You are responsible for resolving any issues you
experience while you are running this old version.

==> Downloading https://github.com/llvm/llvm-project/releases/download/llvmorg-1
==> Downloading from https://github-production-release-asset-2e65be.s3.amazonaws
######################################################################## 100.0%
==> cmake . -DLIBOMP_INSTALL_ALIASES=OFF
==> make install
==> cmake . -DLIBOMP_ENABLE_SHARED=OFF -DLIBOMP_INSTALL_ALIASES=OFF
==> make install
🍺  /usr/local/Cellar/libomp/10.0.0: 9 files, 1.4MB, built in 45 seconds

It looks like I can finally install it.

$ cd ../python-package/
$ python3 setup.py install
/usr/local/lib/python3.6/site-packages/setuptools/dist.py:472: UserWarning: The version specified ('1.1.0-SNAPSHOT') is an invalid version, this may not work as expected with newer versions of setuptools, pip, and PyPI. Please see PEP 440 for more details.
  "details." % version
running install
running build

..........

This time it finished without errors, but is this really all right?
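
One way to answer that question is to ask Python whether the package can actually be found and imported. This is only a minimal sketch: the helper name `installed_version` is hypothetical, and a successful import does not prove that every feature works.

```python
import importlib
import importlib.util


def installed_version(package="xgboost"):
    """Return the package's version string, or None if it cannot be found."""
    if importlib.util.find_spec(package) is None:
        return None
    # Importing also loads the compiled library, so a broken build fails here.
    module = importlib.import_module(package)
    return getattr(module, "__version__", "unknown")


version = installed_version()
if version is None:
    print("xgboost is not installed for this interpreter")
else:
    print("xgboost imported, version:", version)
```

Run it with the same python3 that was used for setup.py, since a different interpreter may not see the package.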