Big Ben

A half-baked coding enthusiast

NLP Notes

Overview

I recently started a project to build a chat bot, i.e. a robot that automatically answers questions in an IM app. To answer questions you first have to understand them, so I have been reading some NLP (Natural Language Processing) material. I did not go deep into the underlying theory or the models; since this is for a project, I mostly want to apply things directly from an engineering point of view. This post mainly covers SyntaxNet (https://github.com/tensorflow/models/tree/master/syntaxnet) and NLTK. The former is an NLP project open-sourced by Google in 2016; it contains the basic models plus a TensorFlow-based implementation, and it ships with a pre-trained English model built from a sizeable amount of training data, so it can be used directly.

SyntaxNet

Installation

Some reference pages:

  • NLP初级选手ubuntu下安装google SyntaxNet (http://blog.csdn.net/u012507864/article/details/51478060)
  • How to Install and Use SyntaxNet and Parsey McParseface (http://www.whycouch.com/2016/07/how-to-install-and-use-syntaxnet-and.html)
  • SyntaxNet Tutorial (https://github.com/tensorflow/models/blob/master/syntaxnet/g3doc/syntaxnet-tutorial.md)
  • SyntaxNet: Understanding the Parser (http://jduelfer.github.io/syntaxnet,/tensorflow/2016/08/20/understanding-the-parser.html)

After installation, run a demo as follows:

echo 'Bob brought the pizza to Alice.' | syntaxnet/demo.sh

The result is a tree structure:

Input: Bob brought the pizza to Alice .
Parse:
brought VBD ROOT
 +-- Bob NNP nsubj
 +-- pizza NN dobj
 |   +-- the DT det
 +-- to IN prep
 |   +-- Alice NNP pobj
 +-- . . punct

SyntaxNet ships with a pre-trained English parser called Parsey McParseface, which we can use to analyze sentences. As described in How to Install and Use SyntaxNet and Parsey McParseface, what Parsey McParseface actually outputs is a CoNLL table. The format of this table is documented in models/syntaxnet/syntaxnet/text_formats.cc, as follows:

// CoNLL document format reader for dependency annotated corpora.
// The expected format is described e.g. at http://ilk.uvt.nl/conll/#dataformat
//
// Data should adhere to the following rules:
//   - Data files contain sentences separated by a blank line.
//   - A sentence consists of one or tokens, each one starting on a new line.
//   - A token consists of ten fields described in the table below.
//   - Fields are separated by a single tab character.
//   - All data files will contains these ten fields, although only the ID
//     column is required to contain non-dummy (i.e. non-underscore) values.
// Data files should be UTF-8 encoded (Unicode).
//
// Fields:
// 1  ID:      Token counter, starting at 1 for each new sentence and increasing
//             by 1 for every new token.
// 2  FORM:    Word form or punctuation symbol.
// 3  LEMMA:   Lemma or stem.
// 4  CPOSTAG: Coarse-grained part-of-speech tag or category.
// 5  POSTAG:  Fine-grained part-of-speech tag. Note that the same POS tag
//             cannot appear with multiple coarse-grained POS tags.
// 6  FEATS:   Unordered set of syntactic and/or morphological features.
// 7  HEAD:    Head of the current token, which is either a value of ID or '0'.
// 8  DEPREL:  Dependency relation to the HEAD.
// 9  PHEAD:   Projective head of current token.
// 10 PDEPREL: Dependency relation to the PHEAD.

The CoNLL table is easier to parse programmatically. To get CoNLL table output we need to modify demo.sh. Here is an example:

INFO:tensorflow:Processed 1 documents
1 What _ PRON WP _ 0 ROOT _ _
2 is _ VERB VBZ _ 1 cop _ _
3 a _ DET DT _ 5 det _ _
4 control _ NOUN NN _ 5 nn _ _
5 panel _ NOUN NN _ 1 nsubj _ _

This is the output for "What is a control panel". Note that SyntaxNet does not fill in fields 3, 6, 9 and 10. (The figure that accompanied this section is from Inside Google SyntaxNet, http://andrewmatteson.name/index.php/2017/02/04/inside-syntaxnet/.)
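Because the output is plain tab-separated text, it is straightforward to consume in code. Below is a minimal Python sketch (my own illustration) that splits each output line into the ten CoNLL fields described in the comment above; the field names follow that comment.

import sys

# Minimal sketch: parse Parsey McParseface's CoNLL output into dicts.
CONLL_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                "feats", "head", "deprel", "phead", "pdeprel"]

def parse_conll(text):
    sentences, tokens = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:                      # a blank line ends a sentence
            if tokens:
                sentences.append(tokens)
                tokens = []
            continue
        values = line.split("\t")         # fields are tab separated
        tokens.append(dict(zip(CONLL_FIELDS, values)))
    if tokens:
        sentences.append(tokens)
    return sentences

if __name__ == "__main__":
    for sentence in parse_conll(sys.stdin.read()):
        print([(t["form"], t["postag"], t["deprel"]) for t in sentence])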

The meanings of all the tag abbreviations in the CoNLL table are documented at Universal Dependency Relations (http://universaldependencies.org/u/dep/).

For CPOSTAG & POSTAG, see:

  • Alphabetical list of part-of-speech tags used in the Penn Treebank Project (https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html)
  • Wh- words (http://www.ling.upenn.edu/histcorpora/annotation/pos-wh.htm)

NLTK

NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, wrappers for industrial-strength NLP libraries, and an active discussion forum.

My plan is to use NLTK to do further stemming/lemmatization on SyntaxNet's output.

NLTK vs. spaCy

Stemming vs. Lemmatization

These two terms always show up together and do very similar things. The difference:

Lemmatisation is closely related to stemming. The difference is that a stemmer operates on a single word without knowledge of the context, and therefore cannot discriminate between words which have different meanings depending on part of speech. However, stemmers are typically easier to implement and run faster, and the reduced accuracy may not matter for some applications.

In computational linguistics, lemmatisation is the algorithmic process of determining the lemma for a given word. Since the process may involve complex tasks such as understanding context and determining the part of speech of a word in a sentence (requiring, for example, knowledge of the grammar of a language) it can be a hard task to implement a lemmatiser for a new language.

In short, lemmatization takes the context into account, while stemming just maps each individual word.

NLTK supports multiple stemmers, including but not limited to the Porter stemmer, the Lancaster stemmer, and the Snowball stemmer.

>>> from nltk.stem import SnowballStemmer
>>> snowball_stemmer = SnowballStemmer("english")
>>> snowball_stemmer.stem('maximum')
u'maximum'
>>> snowball_stemmer.stem('presumably')
u'presum'
>>> snowball_stemmer.stem('multiply')
u'multipli'

Lemmatizing in NLTK:

>>> from nltk.stem import WordNetLemmatizer
>>> wordnet_lemmatizer = WordNetLemmatizer()
>>> wordnet_lemmatizer.lemmatize('dogs')
u'dog'
>>> wordnet_lemmatizer.lemmatize('churches')
u'church'
>>> wordnet_lemmatizer.lemmatize('is', pos='v')
u'be'
>>> wordnet_lemmatizer.lemmatize('are', pos='v')
u'be'
>>>
  • pos = Part Of Speech

Setting Up a Personal Static Blog on GitHub with Evernote + Hexo

The idea of writing a blog has been around for a long time: the more you share, the more you get back. Since I use GitHub a lot, I chose to host the blog on GitHub. At first I looked at Jekyll, then I saw posts about Hexo, thought Hexo looked good too, and decided to start with Hexo.
Because I keep all kinds of reading notes and tinkering logs in Evernote, I wanted a tool that can publish Evernote notes directly as blog posts. I eventually found everblog on npm, which fits my needs exactly, though of course there were plenty of pitfalls along the way. Fortunately everything was resolved in the end; this post records the process.

Setting up a Hexo blog on GitHub

A Google search turns up plenty of guides; here is the one I followed: 手把手教你使用Hexo + Github Pages搭建个人独立博客 (https://linghucong.js.org/2016/04/15/2016-04-15-hexo-github-pages-blog/).

In short:

  1. Register a repo on GitHub named <username>.github.io, e.g. zhougy0717.github.io
  2. Initialize this repo with the hexo tool
  3. All of the blog's configuration, including the deploy information, lives in _config.yml:

deploy:
  type: git
  repo: https://github.com/zhougy0717/zhougy0717.github.io.git
  branch: master

  4. All posts live under source/_posts

Hexo's workflow is to write posts in Markdown under _posts, run hexo g to render the Markdown posts into HTML pages, and finally run hexo d to push the generated pages to GitHub, so the blog can be reached under the .github.io domain.

Everblog

Project page: everblog (https://github.com/everblogjs/everblog)
Usage:

  1. npm install everblog -g
  2. Add .everblogrc in your home directory, containing the fields:
    • token
    • noteStoreUrl
    • notebook
  3. Add index.js in the blog root directory:
module.exports = require('everblog-adaptor-hexo-html')
  4. Run everblog build in the blog root directory
  5. Test with hexo s, deploy with hexo d

Pitfalls

EDAMSystemException: authenticationToken

After digging through the Evernote API and the Evernote Node.js SDK, I finally found out by single-stepping the code that the problem was serviceHost. serviceHost defaults to "www.evernote.com", but the value can be passed to Evernote.Client, and since everblog passes the values in .everblogrc straight to Evernote.Client, we can simply add a serviceHost entry to .everblogrc. With that, the build succeeds.
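For illustration only, a .everblogrc with the extra field might look like the sketch below. I am assuming the simple key: value layout; the values are placeholders, and serviceHost should point to the service you actually use (www.evernote.com for Evernote International, app.yinxiang.com for 印象笔记).

token: <your Evernote developer token>
noteStoreUrl: <your noteStore URL>
notebook: <name of the notebook holding your blog posts>
serviceHost: app.yinxiang.com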

HTML based Hexo blog

Hexo's native workflow is to write posts in Markdown, but as mentioned above, this blog is meant to be driven by Evernote, so that after finishing a note in Marxico (马克飞象) I can publish it directly. MWeb-like tools on Windows all cost money, and these tools, MWeb included, require you to maintain the original note assets yourself and then push the rendered Markdown to the various web services. So I still wanted to keep the Evernote + Marxico workflow.
The adaptor that ships with everblog is everblog-adaptor-hexo, which exports the plain text of a note and generates a Markdown file for publishing; it is based on another project by the everblog author, enml2text. The same author also has a project called enml2html.
So the final solution is to call enml2html on the note content to generate HTML files. There was one problem, though: the generated HTML could not display images.
That is because the way enml2html builds image paths is outdated; following the method on the Evernote developer pages, I updated it and submitted a pull request.

There was still another problem: Yinxiang Biji (印象笔记) does not allow hot-linking its images, i.e. when you are not logged in to the Yinxiang web client, none of the images in the blog show up. So the only option is to download the images locally and point the img src attribute at them so they display.

Displaying local images

Continuing from the previous section: fortunately this need already exists out there, and I found this page: Hexo框架下给博客插入本地图片 (https://app.yinxiang.com/shard/s10/nl/161681/bee73bb9-bb81-45e9-b480-5260a70d3034). Based on the method described there, I modified everblog-adaptor-hexo to download the images and display them. But inline images (the formula images rendered by MathJax) became very large. Looking at the HTML exported by the Evernote client, those images are all scaled down to some degree, and I do not know what its scaling logic is. Rather than over-think it, I simply shrink every image whose name starts with __SVG__ to 47.5%.

Alignment problems in quoted text

This is a Hexo theme issue. I tried several themes; the one I ended up with, freemind, does not have this problem.


Machine Learning (5) - Large Scale Machine Learning

Stochastic gradient descent

Recall the cost function of linear regression:

$J(\theta) = \dfrac{1}{2m}\sum_{i=1}^m\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$

m is the number of samples. When m is very large, say 1,000,000, every gradient step has to run over all the samples, which is a staggering amount of computation.

"Stochastic" means using only one sample per update and iterating repeatedly, achieving the same effect as ordinary (batch) gradient descent.

$cost(\theta, (x^{(i)}, y^{(i)})) = \dfrac{1}{2}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$
$J_{train}(\theta) = \dfrac{1}{m}\sum_{i=1}^m cost(\theta, (x^{(i)}, y^{(i)}))$

  1. Randomly shuffle the dataset
  2. Repeat
    • for i = 1, …, m
      • for j = 0, …, n
        $\theta_j := \theta_j - \alpha\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$
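The update above in a minimal NumPy sketch (my own illustration, for linear regression; each row of X carries a leading 1 so theta includes the bias term):

import numpy as np

def sgd(X, y, alpha=0.01, epochs=10):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        idx = np.random.permutation(m)      # randomly shuffle the dataset
        for i in idx:
            err = X[i] @ theta - y[i]       # h_theta(x_i) - y_i
            theta -= alpha * err * X[i]     # update every theta_j at once
    return theta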

Mini-batch gradient descent

An improved version of stochastic gradient descent:

Batch gradient descent: use all m examples in each iteration
Stochastic gradient descent: use 1 example in each iteration
Mini-batch gradient descent: use b examples in each iteration

Say b = 10, m = 1000
Repeat {
  for i = 1, 11, 21, 31, …, 991 {
    for j = 0, …, n
      $\theta_j := \theta_j - \alpha\dfrac{1}{10}\sum_{k=i}^{i+9}\left(h_\theta(x^{(k)}) - y^{(k)}\right)x_j^{(k)}$
  }
}
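The same idea as a NumPy sketch with batches of b examples (my own illustration; setting b = 1 recovers the stochastic version, b = m the batch version):

import numpy as np

def minibatch_gd(X, y, b=10, alpha=0.01, epochs=10):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(epochs):
        idx = np.random.permutation(m)
        for start in range(0, m, b):
            batch = idx[start:start + b]
            err = X[batch] @ theta - y[batch]                  # shape (b,)
            theta -= alpha * (X[batch].T @ err) / len(batch)   # average over the batch
    return theta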

The trajectory plot of the stochastic version shows that, compared with the batch algorithm, it meanders a lot more and has a harder time converging, so the learning rate $\alpha$ must be chosen with extra care. A common approach is to adjust $\alpha$ dynamically:

Learning rate $\alpha$ is typically held constant. It can slowly be decreased over time if we want $\theta$ to converge, e.g. $\alpha = \dfrac{const1}{iterationNumber + const2}$.

Online learning

This is really just an application of the stochastic algorithm: since each iteration does not need all the samples, the algorithm can collect data and keep iterating/optimizing at the same time.

Map-reduce and data parallelism

Map-reduce here simply means distributed computation: it relieves the computational pressure at any single point and lends itself to running on a cluster.

Machine Learning (4) - Unsupervised Learning

Clustering - K-means Algorithm

K-means initializes a set of cluster centroids and then, through repeated iteration, moves them to the centers of the clusters, thereby classifying the data.

Steps

  1. Randomly initialize K cluster centroids.
    1) Should have $K < m$
    2) Randomly pick K training examples
    3) Set $\mu_1, \mu_2, \dots, \mu_K$ equal to these K examples.
  2. Repeat
    for i = 1 to m
      $c^{(i)}$ := index (from 1 to K) of the cluster centroid closest to $x^{(i)}$
      (assign $x^{(i)}$ to the centroid nearest to it)
    for k = 1 to K
      $\mu_k$ := average (mean) of the points assigned to cluster k
      (move each centroid according to all the samples assigned to it)
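A compact NumPy sketch of the two repeated steps above (my own illustration, not the course code):

import numpy as np

def kmeans(X, K, iters=100):
    # step 1: pick K random training examples as the initial centroids
    mu = X[np.random.choice(len(X), K, replace=False)]
    for _ in range(iters):
        # assign each x^(i) to its nearest centroid
        c = np.argmin(((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2), axis=1)
        # move each centroid to the mean of the points assigned to it
        mu = np.array([X[c == k].mean(axis=0) if np.any(c == k) else mu[k]
                       for k in range(K)])
    return c, mu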

Tuning

Purpose

To avoid landing in a local optimum because of a bad initialization.

Cost Function

$J(c^{(1)}, c^{(2)}, \dots, c^{(m)}, \mu_1, \mu_2, \dots, \mu_K) = \dfrac{1}{m}\sum_{i=1}^m \lVert x^{(i)} - \mu_{c^{(i)}} \rVert^2$

Tuning means minimizing this cost function. Concretely, run K-means several times (Andrew recommends 100 runs) and keep the run with the smallest cost.

Choosing the number of clusters

The elbow method: plot the cost against the number of clusters and pick the "elbow". If the curve shows no clear elbow the method breaks down; per Andrew's advice you then need domain logic to decide how many clusters to use. It depends on how to make the downstream or later process happy.

Dimensionality Reduction - Principal Component Analysis (PCA)

Dimensionality reduction means reducing the number of dimensions, e.g. 3D -> 2D.

Steps

  1. Feature scaling / mean normalization.
    1) $\mu_j = \dfrac{1}{m}\sum_{i=1}^m x_j^{(i)}$
    2) $x_j^{(i)} = x_j^{(i)} - \mu_j$
    3) (optional) $x_j^{(i)} = \dfrac{x_j^{(i)} - \mu_j}{s_j}$, where $s_j$ is the standard deviation
  2. Compute the "covariance matrix": $\Sigma = \dfrac{1}{m}\sum_{i=1}^m (x^{(i)})(x^{(i)})^T$
  3. Call a library routine to compute the "eigenvectors" of matrix $\Sigma$: [U,S,V] = svd(Sigma);
  4. Ureduce = U(:,1:k); z = Ureduce'*x;

Reconstruct

x = Ureduce * z
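The same steps in a NumPy sketch (my own illustration; np.linalg.svd plays the role of Octave's svd):

import numpy as np

def pca(X, k):
    mu = X.mean(axis=0)
    Xc = X - mu                          # mean normalization
    Sigma = (Xc.T @ Xc) / len(X)         # covariance matrix
    U, S, Vt = np.linalg.svd(Sigma)      # eigenvectors of Sigma
    Ureduce = U[:, :k]
    Z = Xc @ Ureduce                     # project: z = Ureduce' * x
    X_approx = Z @ Ureduce.T + mu        # reconstruct: x ≈ Ureduce * z
    return Z, X_approx, S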

Choose k

Typically, choose k to be the smallest value such that

$\dfrac{\frac{1}{m}\sum_{i=1}^m\lVert x^{(i)} - x_{approx}^{(i)} \rVert^2}{\frac{1}{m}\sum_{i=1}^m\lVert x^{(i)} \rVert^2} \leq 0.01$

0.01 means 99% of the variance is retained.

Another way to compute the retained variance via the svd function:

  1. Start from k = 1
  2. [U,S,V] = svd(Sigma)
  3. Check whether $1 - \dfrac{\sum_{i=1}^k S_{ii}}{\sum_{i=1}^n S_{ii}} \leq 0.01$
  4. Pick the smallest k that satisfies it.
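With the singular values in hand, the smallest k retaining 99% of the variance can be read off directly (a sketch; S is the 1-D array of singular values, i.e. the diagonal of Octave's S):

import numpy as np

def choose_k(S, retain=0.99):
    ratio = np.cumsum(S) / np.sum(S)        # sum_{i<=k} S_ii / sum_i S_ii
    return int(np.argmax(ratio >= retain)) + 1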

Application of PCA

  • Compression: saves storage and speeds up computation
  • Visualization: k = 2 or k = 3

Do not abuse PCA, for example by using it to shrink the feature set in order to fight overfitting.
Before implementing PCA, first try running whatever you want to do with the original/raw data $x^{(i)}$. Only if that doesn't do what you want, then implement PCA and consider using $z^{(i)}$.

Anomaly detection algorithm

I translate this as "anomaly detection" (异常检测). In essence it detects abnormal values of features that follow a normal (Gaussian) distribution.

  1. Choose features $x_i$ that you think might be indicative of anomalous examples.
  2. Fit parameters $\mu_1, \dots, \mu_n, \sigma_1, \dots, \sigma_n$:
     $\mu_j = \dfrac{1}{m}\sum_{i=1}^m x_j^{(i)}$
     $\sigma_j^2 = \dfrac{1}{m}\sum_{i=1}^m (x_j^{(i)} - \mu_j)^2$
  3. Given a new example $x$, compute the probability $p(x) = \prod_{j=1}^n p(x_j; \mu_j, \sigma_j^2)$
  4. Flag an anomaly if $p(x) \leq \epsilon$

The Gaussian probability density is:

$p(x) = \dfrac{1}{\sqrt{2\pi}\sigma}\exp\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$
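A NumPy sketch of the fit/score steps above (my own wording of the algorithm, names are mine):

import numpy as np

def fit_gaussian(X):
    mu = X.mean(axis=0)
    sigma2 = ((X - mu) ** 2).mean(axis=0)
    return mu, sigma2

def p(x, mu, sigma2):
    # product over features of the univariate Gaussian densities
    dens = np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)
    return dens.prod()

# flag x_new as an anomaly if p(x_new, mu, sigma2) <= epsilon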

Evaluating the results

How to split the data for training and validation:

Aircraft engines motivating example
10000 good(normal) engines
20 flawed engines (anomalous)

Training set: 6000 good engines
CV: 2000 good engines(y=0), 10 anomalous(y=1)
Test: 2000 good engines(y=0), 10 anomalous(y=1)
In other words, the anomalous samples are reserved for validation and testing.

Choosing the features

What if a feature is not normally distributed? Andrew's suggestion is to transform it so that the result is approximately normal, for example:
$x_{new} = \log(x + c)$
$x_{new} = x^{\frac{1}{n}}$

Andrew did a live demo of this on the Octave command line; the hist command can be used to plot the transformed feature and check whether it now looks Gaussian.

The Error Analysis mentioned in the notes refers to this: for some anomalies every component of the feature vector sits in a high-probability region, yet the combination of certain components is a low-probability event; in that case a new feature needs to be created. In Andrew's example, the CPU load is relatively high but still within the normal range, while the network traffic is very low yet also within the normal range. You then need a new variable such as $x_5 = \dfrac{CPU\ load}{network\ traffic}$ or $x_6 = \dfrac{(CPU\ load)^2}{network\ traffic}$, depending on which new feature is (approximately) normally distributed.

Multivariate Gaussian distribution

The anomaly detection above assumes the feature components are independently distributed; if they are not, you use the Error Analysis approach described above. The multivariate Gaussian is the complement for non-independent distributions; in other words, the anomaly detection above is really a special case of the multivariate Gaussian.

Andrew gives the vectorized probability density of the multivariate Gaussian:

$p(x; \mu, \Sigma) = \dfrac{1}{(2\pi)^{\frac{n}{2}}\lvert\Sigma\rvert^{\frac{1}{2}}}\exp\left(-\dfrac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)$
$\mu = \dfrac{1}{m}\sum_{i=1}^m x^{(i)}$
$\Sigma = \dfrac{1}{m}\sum_{i=1}^m (x^{(i)}-\mu)(x^{(i)}-\mu)^T$

Here $\Sigma$ is called the covariance matrix and $\lvert\Sigma\rvert$ is its determinant (det(A) in Octave); see the Wikipedia entry on determinants (https://zh.wikipedia.org/wiki/行列式).

In the formula above, the covariance matrix $\Sigma$ is what carries the information about the correlation between the variables.

Both the original model and the multivariate model can detect the situation in the CPU load / network traffic example above. How to choose between them? Andrew gives a comparison: although the multivariate model looks more principled, the original model is usually the more practical choice.
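As a usage sketch for the multivariate model (SciPy is my choice here, not something the notes mention), you only need to fit mu and Sigma and then evaluate the density:

import numpy as np
from scipy.stats import multivariate_normal

def fit_multivariate(X):
    mu = X.mean(axis=0)
    Xc = X - mu
    Sigma = (Xc.T @ Xc) / len(X)          # covariance matrix
    return mu, Sigma

mu, Sigma = fit_multivariate(np.random.randn(500, 2))
p_x = multivariate_normal(mean=mu, cov=Sigma).pdf([0.1, -0.2])  # flag anomaly if below epsilon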

Anomaly detection vs. supervised learning

Recommender Systems

Content-based recommender systems

The problem: given the features (feature vector) of every movie and the users' ratings of movies, predict a user's rating for movies they have not seen (see the ratings table in the lecture). The method is linear regression: minimize the cost function

$\min_{\theta^{(1)}, \theta^{(2)}, \dots, \theta^{(n_u)}}\ \dfrac{1}{2}\sum_{j=1}^{n_u}\sum_{i:r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \dfrac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\left(\theta_k^{(j)}\right)^2$

where:

  • $n_u$: the number of users
  • $n$: the number of movies
  • $r(i,j) = 1$: user j has rated movie i
  • $y^{(i,j)}$: the user's actual rating
  • $(\theta^{(j)})^T x^{(i)}$: the hypothesis, i.e. the predicted rating

Gradient Descent:
for k = 0:
$\theta_k^{(j)} := \theta_k^{(j)} - \alpha\sum_{i:r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)x_k^{(i)}$
for k ≠ 0:
$\theta_k^{(j)} := \theta_k^{(j)} - \alpha\left(\sum_{i:r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)x_k^{(i)} + \lambda\theta_k^{(j)}\right)$

Collaborative filtering

Suppose that in the training of the previous section the $\theta^{(j)}$ were known while the $x^{(i)}$ were unknown, i.e. each movie's per-feature scores were unknown; the same kind of training could then recover the movie features. Collaborative filtering starts with empty hands: neither $\theta$ nor $x$ is known, only the ratings $y^{(i,j)}$, and it still predicts both $\theta$ and $x$.

  1. Initialize $x^{(1)}, \dots, x^{(n_m)}, \theta^{(1)}, \dots, \theta^{(n_u)}$ to small random values.
  2. Minimize $J(x^{(1)}, \dots, x^{(n_m)}, \theta^{(1)}, \dots, \theta^{(n_u)})$ using gradient descent (or an advanced optimization algorithm), where
     $J = \dfrac{1}{2}\sum_{(i,j):r(i,j)=1}\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)^2 + \dfrac{\lambda}{2}\sum_{i=1}^{n_m}\sum_{k=1}^{n}\left(x_k^{(i)}\right)^2 + \dfrac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\left(\theta_k^{(j)}\right)^2$
  3. For a user with parameters $\theta$ and a movie with (learned) features $x$, predict a star rating of $\theta^T x$.

Because there are users who have rated no movie at all, without special handling their predicted rating for every movie would be 0 (Andrew shows the argument in the video). So mean normalization is needed; concretely:

  • $\mu = \dfrac{1}{m}\sum_{i=1}^m y^{(i)}$ (the mean rating of each movie)
  • $Y = Y - \mu$
  • Do not divide by the standard deviation, because the ratings are already on roughly the same range.
The vectorized form of collaborative filtering is:

$h_\Theta = X\Theta^T$

The collaborative filtering algorithm is also called low rank matrix factorization, where $X\Theta^T$ is the low rank matrix.

When recommending to a user: if $\lVert x_j - x_i \rVert$ is small enough, movies i and j can be considered sufficiently similar.
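A NumPy sketch of the vectorized prediction and the similarity check (my own illustration; rows of X are movie feature vectors, rows of Theta are user parameter vectors, mu is the per-movie mean removed by normalization):

import numpy as np

def predict_ratings(X, Theta, mu):
    # h_Theta = X * Theta^T, adding back the per-movie mean
    return X @ Theta.T + mu[:, None]

def most_similar_movies(X, i, top=5):
    # movies j with small ||x_j - x_i|| are considered similar to movie i
    dist = np.linalg.norm(X - X[i], axis=1)
    return np.argsort(dist)[1:top + 1]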


Machine Learning (3) - Support Vector Machine (SVM)


SVM stands for Support Vector Machine (支持向量机).

Its hallmark is the large margin; compared with logistic regression it can reach the global optimum more easily.

How it works

SVM works with landmarks, as in the lecture figure. Once the $\theta$ values are trained as shown there, y is predicted to be 1 when x falls near $l^{(1)}$ or $l^{(2)}$, and 0 when it falls near $l^{(3)}$. How is this achieved?

Hypothesis

The per-example cost of logistic regression is

$-y\log\dfrac{1}{1+e^{-\theta^Tx}} - (1-y)\log\left(1-\dfrac{1}{1+e^{-\theta^Tx}}\right)$

SVM approximates the two parts of this formula with piecewise-linear cost functions ($cost_1$ and $cost_0$), as in the figure: the useful part is the flat tail along the z axis, and the exact slope of the sloped segment does not matter.

SVM Decision Boundary

$\min_\theta\ C\sum_{i=1}^{m}\left[y^{(i)}cost_1(\theta^Tx^{(i)}) + (1-y^{(i)})cost_0(\theta^Tx^{(i)})\right] + \dfrac{1}{2}\sum_{j=1}^{n}\theta_j^2$, with $C = \dfrac{1}{\lambda}$

Kernel

Gaussian kernel:

$f_i = similarity(x, l^{(i)}) = \exp\left(-\dfrac{\lVert x - l^{(i)} \rVert^2}{2\sigma^2}\right)$

Steps

  1. Given $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})$
  2. Choose $l^{(1)} = x^{(1)}, l^{(2)} = x^{(2)}, \dots, l^{(m)} = x^{(m)}$
  3. Map x -> f (the vector of kernel similarities)
  4. Predict "y = 1" if $\theta^T f \geq 0$

Note: Do perform feature scaling before using the Gaussian kernel.

Multiclass classification:
Use the one-vs-all method (train K SVMs, one to distinguish y = i from the rest, for i = 1, 2, …, K) to get $\theta^{(1)}, \theta^{(2)}, \dots, \theta^{(K)}$.
For a new input x, pick the i that maximizes $(\theta^{(i)})^T x$.

Parameters

C = $\dfrac{1}{\lambda}$
  • Large C: small $\lambda$. Lower bias, higher variance.
  • Small C: large $\lambda$. Higher bias, lower variance.

$\sigma^2$
  • Large $\sigma^2$: features $f_i$ vary more smoothly. Higher bias, lower variance.
  • Small $\sigma^2$: features $f_i$ vary less smoothly. Lower bias, higher variance.
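As a usage sketch (not from the lecture), scikit-learn's SVC, which is built on the LIBSVM library recommended at the end of this post, exposes exactly these two knobs: C, and gamma = 1/(2σ²) for the Gaussian (RBF) kernel. Remember the feature-scaling note above.

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = np.random.randn(200, 2)
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)     # toy non-linear labels

X_scaled = StandardScaler().fit_transform(X)          # do feature scaling first
sigma = 0.5
clf = SVC(kernel="rbf", C=1.0, gamma=1.0 / (2 * sigma ** 2))
clf.fit(X_scaled, y)
print(clf.predict(X_scaled[:5]))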

Andrew mentioned that there are many other kernels, but they see little use, including:

  • Polynomial kernel
  • String kernel
  • Chi-square kernel
  • Histogram intersection kernel

Logistic regression vs. SVMs

n = number of features ($x \in R^{n+1}$), m = number of training examples

  • If n is large (relative to m): Use logistic regression, or SVM without a kernel (“linear kernel”).
  • If n is small, m is intermediate: Use SVM with Gaussian kernel.
  • If n is small, m is large: Create/add more features, then use logistic regression or SVM without a kernel.
  • Neural network likely to work well for most of these settings, but may be slower to train.

Recommendation for SVM implementation

LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/)


Machine Learning (2) - Neural Network

Model

$a_i^{(j)}$: activation of unit i in layer j
$\Theta^{(j)}$: matrix of weights controlling the function mapping from layer j to layer j+1

This is a three-layer neural network. Every layer applies the sigmoid function of logistic regression, so the network is non-linear.

$a_1^{(2)} = g(\Theta_{10}^{(1)} + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3)$
$a_2^{(2)} = g(\Theta_{20}^{(1)} + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3)$
$a_3^{(2)} = g(\Theta_{30}^{(1)} + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3)$

The sigmoid function is $g(z) = \dfrac{1}{1+e^{-z}}$

$h_\Theta(x) = a_1^{(3)} = g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})$

With the hypothesis function $h_\theta(x)$ we can compute the cost function. The formula is in Machine Learning (1) - Linear & Logistic Regression; excerpted here:

$J(\theta) = -\dfrac{1}{m}\sum_{i=1}^m\left[y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)}))\right]$

Note:

  • The size of $\Theta^{(j)}$ is $s_{j+1}\times(s_j+1)$, determined by the node counts of the previous layer and this layer. The +1 comes from the bias node, i.e. the constant (+1) node, which is not counted among the nodes.
  • The number of $\Theta$ matrices depends on the number of layers: a 3-layer network has only 2 $\Theta$ matrices.
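A minimal forward-propagation sketch for this 3-layer network (my own code; Theta1 is Θ^(1) of size s2×(s1+1), Theta2 is Θ^(2) of size s3×(s2+1)):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(Theta1, Theta2, x):
    a1 = np.concatenate(([1.0], x))                   # add the bias (+1) node
    a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))
    a3 = sigmoid(Theta2 @ a2)                         # h_Theta(x)
    return a3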

Cost Function

Continuing from the previous chapter's formula, the complete multi-class, regularized cost function is:

$J(\Theta) = -\dfrac{1}{m}\sum_{i=1}^m\sum_{k=1}^K\left[y_k^{(i)}\log(h_\Theta(x^{(i)}))_k + (1-y_k^{(i)})\log(1-(h_\Theta(x^{(i)}))_k)\right] + \dfrac{\lambda}{2m}\sum_{l=1}^{L-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}}\left(\Theta_{j,i}^{(l)}\right)^2$

A brief explanation:

  • The first half of the formula is the original hypothesis cost, summed over all K output classes.
  • The second half is the regularization term: the squares of all the parameters, summed. An L-layer network has L-1 $\Theta$ matrices, one of size $s_{l+1}\times(s_l+1)$ per layer transition.

Note that the regularized term sums only the non-bias $\Theta$ entries; the bias terms are not included.

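As a concrete illustration (the counts below are mine, not from the original notes): for the 3-3-1 network above, $\Theta^{(1)}$ is $3\times4$ and $\Theta^{(2)}$ is $1\times4$, and only the non-bias columns enter the penalty:

$$\dfrac{\lambda}{2m}\Big[\sum_{i=1}^{3}\sum_{j=1}^{3}(\Theta^{(1)}_{i,j})^2 + \sum_{j=1}^{3}(\Theta^{(2)}_{1,j})^2\Big]$$

so 9 + 3 = 12 squared weights are penalized, while the 4 bias weights $\Theta^{(1)}_{1,0}, \Theta^{(1)}_{2,0}, \Theta^{(1)}_{3,0}, \Theta^{(2)}_{1,0}$ are not.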
Backpropagation Algorithm

With the cost function in hand we can start Gradient Descent. The basic formula is:

$$\theta_j := \theta_j - \dfrac{\alpha}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})\cdot x_j^{(i)}$$

With a neural network, however, the partial derivatives are no longer this simple, which is why Forward propagation & Backpropagation are introduced: compute forward once, then backward once, and combining the two gives the partial derivatives of the cost function. I cannot prove it, but Andrew gives the formulas.
The whole algorithm is as follows. Unlike the 4-layer network in the course material, it uses the three-layer network of this note:

Given training set $\{(x^{(1)}, y^{(1)}), ..., (x^{(m)}, y^{(m)})\}$
Set $\Delta_{ij}^{(l)} = 0$ for all (l, i, j). size($\Delta^{(l)}$) = $s_{l+1} \times (s_l+1)$, the same as $\Theta^{(l)}$
For i = 1 to m
   1. Set $a^{(1)} = x^{(i)}$, both a and x are vectors
   2. Perform forward propagation to compute $a^{(l)}$ for l = 2, 3, …, L, where L is the number of layers
         $a^{(1)} = \begin{bmatrix} 1 & x \end{bmatrix}$ (note the added bias term)
         $z^{(2)} = \Theta^{(1)}a^{(1)}$
         $a^{(2)} = \begin{bmatrix} 1 & g(z^{(2)}) \end{bmatrix}$ (note the added bias term)
         $z^{(3)} = \Theta^{(2)}a^{(2)}$
         $a^{(3)} = h_\Theta(x) = g(z^{(3)})$

   3. Using $y^{(i)}$, compute $\delta^{(L)} = a^{(L)} - y^{(i)}$. y is the measured output, $a^{(L)} = h_\Theta(x)$ is the model's estimate of the output, so $\delta^{(L)}$ is the model error.
   4. Use backpropagation to compute $\delta^{(L-1)}, \delta^{(L-2)}, ..., \delta^{(2)}$; the input layer has no $\delta$, i.e. there is no $\delta^{(1)}$
     $\delta^{(l)} = (\Theta^{(l)})^T\delta^{(l+1)} .* a^{(l)} .* (1 - a^{(l)})$ (this line is probably not quite right; see the notes on $\delta$ below)

Note:
The computation of $\delta$ needs special attention. A dedicated section below explains how $\delta$ is computed.

   5. Compute $\Delta_{ij}^{(l)}$
     $\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)}\delta_i^{(l+1)}$
     Vectorized equation: $\Delta^{(l)} := \Delta^{(l)} + \delta^{(l+1)}(a^{(l)})^T$, where j is the node index within a layer
Note:
The accumulation here is over the training samples.

   6. Final step, compute the partial derivatives of $J(\Theta)$
     $\dfrac{\partial}{\partial\Theta_{ij}^{(l)}}J(\Theta) = D_{ij}^{(l)}$
     $D_{ij}^{(l)} := \dfrac{1}{m}\Delta_{ij}^{(l)} + \dfrac{\lambda}{m}\Theta_{ij}^{(l)} \quad \text{if } j \neq 0$
     $D_{ij}^{(l)} := \dfrac{1}{m}\Delta_{ij}^{(l)} \quad \text{if } j = 0$

The whole procedure looks a bit involved, but that is what a neural network is. Supposedly more elegant algorithms come later.
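To keep the dimensions straight, here is the same loop instantiated for the 3-3-1 network; the shape bookkeeping below is my own illustration of the formulas above:

$a^{(1)} \in \mathbb{R}^{4}$ (input plus bias), $\Theta^{(1)} \in \mathbb{R}^{3\times4}$, $z^{(2)} \in \mathbb{R}^{3}$, $a^{(2)} \in \mathbb{R}^{4}$, $\Theta^{(2)} \in \mathbb{R}^{1\times4}$, $a^{(3)} = h_\Theta(x) \in \mathbb{R}$
$\delta^{(3)} = a^{(3)} - y^{(i)} \in \mathbb{R}$
$\delta^{(2)} = \big((\Theta^{(2)})^T\delta^{(3)}\big) .* g'(z^{(2)}) \in \mathbb{R}^{3}$ (the first component of $(\Theta^{(2)})^T\delta^{(3)}$ belongs to the bias unit and is dropped before the element-wise product)
$\Delta^{(2)} := \Delta^{(2)} + \delta^{(3)}(a^{(2)})^T \in \mathbb{R}^{1\times4}$
$\Delta^{(1)} := \Delta^{(1)} + \delta^{(2)}(a^{(1)})^T \in \mathbb{R}^{3\times4}$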

Computing $\delta$

$\delta$: an error term that measures how much the node was responsible for any errors in our output.

$cost(t) = y^{(t)}\log(h_\Theta(x^{(t)})) + (1-y^{(t)})\log(1-h_\Theta(x^{(t)}))$
$\delta_j^{(l)} = \dfrac{\partial}{\partial z_j^{(l)}}cost(t) = (\Theta^{(l)})^T\delta^{(l+1)} .* g'(z^{(l)})$
$g'(z) = \left(\dfrac{1}{1+e^{-z}}\right)' = g(z)(1-g(z))$

So the final expression should be:

$$\delta^{(l)} = (\Theta^{(l)})^T\delta^{(l+1)} .* g(\Theta^{(l-1)}a^{(l-1)}) .* \big(1-g(\Theta^{(l-1)}a^{(l-1)})\big)$$

where g is the sigmoid function.

Algorithm Tuning

When the algorithm's error is too large, there are two kinds of problems:

  • Overfitting (过拟合 / High Variance)
  • Underfitting (欠拟合 / High Bias)

For these two problems, what we need to do is:
  1. recognize them
  2. take the corresponding countermeasures

How to Produce the Right Model

First split the training samples into three parts:

  • Training set: 60%
  • Cross validation set: 20%
  • Test set: 20%

Use the training set to fit $\Theta$ (and $\lambda$), use the cross validation set to choose the polynomial degree, and finally use the test set error $J_{test}(\Theta^{(d)})$ to evaluate how good the algorithm is.

Polynomial Degree - d

Referring to the plot of training error and cross validation error against the polynomial degree d (figure omitted):

For the same $\Theta$, when the polynomial degree is very low we may get Underfitting; both $J_{cv}$ and $J_{train}$ are large.
When the polynomial degree is very high we may get Overfitting; $J_{cv}$ is much larger than $J_{train}$. This is a very useful reference for picking d.

Regularization - $\lambda$

  1. Create a list of lambdas (i.e. λ∈{0, 0.01, 0.02, 0.04, …, 10.24});
  2. Compute the train error and cross validation error without the $\lambda$ term;
  3. Plot them against $\lambda$ (figure omitted);
  4. Similar to choosing d, pick a suitable $\lambda$.

Random initialization

Randomize the initial values of $\Theta$. In the ex4 exercise, the following formula is given for initializing $\Theta$:

$$\epsilon_{init} = \dfrac{\sqrt{6}}{\sqrt{L_{in} + L_{out}}}$$
$L_{in} = s_l$
$L_{out} = s_{l+1}$
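A minimal C sketch of this initialization, assuming the usual ex4 convention of drawing each weight uniformly from [-ε_init, ε_init]; the function name rand_init_theta and the uniform-range convention are my assumptions:

#include <math.h>
#include <stdlib.h>

/* Fill a (rows x cols) weight matrix with values drawn uniformly from
 * [-eps, eps], where eps = sqrt(6) / sqrt(L_in + L_out).
 * Here L_in = cols - 1 (input units, excluding bias) and L_out = rows. */
static void rand_init_theta(double *theta, int rows, int cols)
{
    double eps = sqrt(6.0) / sqrt((double)(cols - 1) + (double)rows);
    for (int i = 0; i < rows * cols; i++) {
        double u = (double)rand() / (double)RAND_MAX;   /* u in [0, 1] */
        theta[i] = u * 2.0 * eps - eps;                 /* in [-eps, eps] */
    }
}

int main(void)
{
    double Theta1[3][4];                 /* e.g. the 3x4 matrix Theta^(1) */
    rand_init_theta(&Theta1[0][0], 3, 4);
    return 0;
}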

Identifying Overfitting (High Variance) / Underfitting (High Bias)

To tell them apart, look at the Learning Curves.

High Bias: (learning-curve figure omitted) test error and train error are close to each other, and both are high.

High Variance: (learning-curve figure omitted) train error stays close to the right value, but test error is still high; as the training set size grows, test error keeps decreasing, but only very slowly.

What to try next?

  • Getting more training examples: Fixes high variance
  • Trying smaller sets of features: Fixes high variance
  • Adding features: Fixes high bias
  • Adding polynomial features: Fixes high bias
  • Decreasing λ: Fixes high bias
  • Increasing λ: Fixes high variance.

还有一个办法就是:
如果想得到好算法,做到两方面即可:

  1. 引入很多变量,参数 => high variance, low bias
  2. 提供大量training samples => low variance

Error Analysis

  • Start with a simple algorithm, implement it quickly, and test it early on your cross validation data.
  • Plot learning curves to decide if more data, more features, etc. are likely to help.
  • Manually examine the errors on examples in the cross validation set and try to spot a trend where most of the errors were made.

Special Case: Skewed Data

When building a classifier, if one class is the overwhelming majority, e.g. 99%, our predictions may need to be biased. For example, when predicting cancer we would rather over-diagnose than miss a case, i.e. we want high Recall.

$Precision = \dfrac{true\ positives}{true\ positives + false\ positives}$ (how accurate the positive predictions are)

$Recall = \dfrac{true\ positives}{true\ positives + false\ negatives}$ (the hit rate)

$Accuracy = \dfrac{true\ positives + true\ negatives}{all\ samples}$

By trying different thresholds we can plot Precision against Recall, and then choose the threshold according to how we weigh Recall vs. Precision.

How to judge how good the algorithm is:
$F_1$-Score (F Score) = $2\dfrac{PR}{P+R}$
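A quick worked example with made-up counts (the numbers are only an illustration): suppose on the cross validation set we get 80 true positives, 20 false positives and 40 false negatives. Then

$$P = \dfrac{80}{80+20} = 0.8, \qquad R = \dfrac{80}{80+40} \approx 0.667, \qquad F_1 = 2\cdot\dfrac{0.8 \times 0.667}{0.8 + 0.667} \approx 0.727$$

A classifier that simply predicted every sample as negative would have Recall 0 and therefore an $F_1$ of 0, which is why $F_1$ is a more telling single number than Accuracy on skewed data.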


Machine Learning (1) - Linear & Logistic Regression

Linear Regression

Linear Regression = 线性回归。

Single Feature Hypothesis:

$$h_{\theta}(x) = \theta_0 + \theta_1 x$$

We estimate $\theta_i$ by minimizing the cost function.
The single-feature cost function is listed without much discussion; the focus is on Multiple Feature Linear Regression.

Cost Function:

$$J(\theta) = \dfrac{1}{2m}\sum_{i=1}^m(h_\theta(x_i) - y_i)^2$$

Cost Function & Gradient Descent

Multiple Feature Hypothesis:

$$h_\theta(x) = \begin{bmatrix}\theta_0 & \theta_1 & \cdots & \theta_n\end{bmatrix}\begin{bmatrix}x_0 \\ x_1 \\ \vdots \\ x_n\end{bmatrix} = \theta^Tx$$

Multiple Feature Cost Function:

$$J(\theta) = \dfrac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2$$

Gradient Descent:

$$\theta_j := \theta_j - \alpha\dfrac{\partial}{\partial\theta_j}J(\theta) = \theta_j - \alpha\dfrac{1}{m}\sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})\cdot x_j^{(i)}$$

  • j := 0…n
  • $\alpha$ - Learning rate.

How to choose the learning rate $\alpha$?

  • If $\alpha$ is too small: slow convergence
  • If $\alpha$ is too large: J($\theta$) may not decrease on every iteration; may not converge.

To choose $\alpha$, try:
…, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, …
i.e. multiply by roughly 3 each time.
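As an illustration of one batch gradient-descent step, here is a minimal C sketch; the toy data, learning rate and function names are mine, not from the notes:

#include <stdio.h>

#define M 4   /* number of training examples */
#define N 2   /* number of parameters, including x0 = 1 */

/* One batch gradient-descent update for linear regression:
 * theta_j := theta_j - (alpha/m) * sum_i (h(x_i) - y_i) * x_ij */
static void gd_step(double theta[N], double x[M][N], double y[M], double alpha)
{
    double grad[N] = {0.0};
    for (int i = 0; i < M; i++) {
        double h = 0.0;
        for (int j = 0; j < N; j++) h += theta[j] * x[i][j];
        for (int j = 0; j < N; j++) grad[j] += (h - y[i]) * x[i][j];
    }
    for (int j = 0; j < N; j++)
        theta[j] -= alpha * grad[j] / M;   /* simultaneous update */
}

int main(void)
{
    /* Toy data lying on y = 1 + 2x, with x0 = 1 as the bias feature. */
    double x[M][N] = {{1, 0}, {1, 1}, {1, 2}, {1, 3}};
    double y[M] = {1, 3, 5, 7};
    double theta[N] = {0, 0};
    for (int it = 0; it < 2000; it++) gd_step(theta, x, y, 0.1);
    printf("theta = [%f, %f]\n", theta[0], theta[1]);  /* approx [1, 2] */
    return 0;
}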

Feature Scaling

We can speed up gradient descent by having each of our input values in roughly the same range. This is because θ will descend quickly on small ranges and slowly on large ranges, and so will oscillate inefficiently down to the optimum when the variables are very uneven.

Preprocessing our training set with the formula below is Feature Scaling:

$$x_i := \dfrac{x_i - \mu_i}{s_i}$$

  • $\mu_i$ is the average of all the values for feature (i)
  • $s_i$ is the range of values (max - min)
  • or $s_i$ is the standard deviation.

* standard deviation (标准差) = $\sqrt{\dfrac{1}{N}\sum_{i=1}^N(x_i - \mu)^2}$
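A tiny worked example (the numbers are made up): for a feature with values {100, 200, 300}, $\mu = 200$ and the range $s = 300 - 100 = 200$, so the scaled values are

$$\dfrac{100-200}{200} = -0.5, \qquad \dfrac{200-200}{200} = 0, \qquad \dfrac{300-200}{200} = 0.5$$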

Normal Equation (正规解)

$$\theta = (X^TX)^{-1}X^Ty$$

  • Note 1: the number of training examples must be larger than the number of features, otherwise $X^TX$ is not invertible and there is no solution.
  • Note 2: the normal equation does not require feature scaling.
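A small worked example (data made up): fit $h_\theta(x) = \theta_0 + \theta_1 x$ to the two points (1, 1) and (2, 2).

$$X = \begin{bmatrix}1 & 1\\ 1 & 2\end{bmatrix}, \quad y = \begin{bmatrix}1\\2\end{bmatrix}, \quad X^TX = \begin{bmatrix}2 & 3\\ 3 & 5\end{bmatrix}, \quad (X^TX)^{-1} = \begin{bmatrix}5 & -3\\ -3 & 2\end{bmatrix}, \quad X^Ty = \begin{bmatrix}3\\5\end{bmatrix}$$

$$\theta = (X^TX)^{-1}X^Ty = \begin{bmatrix}0\\1\end{bmatrix}$$

so the fitted line is $h_\theta(x) = x$, which passes through both points, as expected.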

Logistic Regression

Hypothesis:

$$h_\theta(x) = g(\theta^Tx)$$

Sigmoid Function / Logistic Function

$$g(z) = \dfrac{1}{1+e^{-z}}, \qquad z = \theta^Tx$$

The sigmoid curve is the familiar S-shape that maps any real z into (0, 1).

If we reused the squared-error cost of linear regression with this hypothesis, the resulting cost function would be non-convex, with many local optima, so Gradient Descent could not be relied on to converge to the global minimum. That is why the cost function below is used instead.

Expressing the hypothesis as a conditional probability: $h_\theta(x)$ is the probability that y = 1 given x, parameterized by $\theta$.

$$h_\theta(x) = P(y=1\mid x;\theta) = 1 - P(y=0\mid x;\theta)$$
$$P(y=0\mid x;\theta) + P(y=1\mid x;\theta) = 1$$

Cost Function

$$J(\theta) = -\dfrac{1}{m}\sum_{i=1}^m Cost(h_\theta(x^{(i)}), y^{(i)})$$
$$Cost(h_\theta(x), y) = -\log(h_\theta(x)) \quad \text{if } y = 1$$
$$Cost(h_\theta(x), y) = -\log(1-h_\theta(x)) \quad \text{if } y = 0$$

$Cost(h_\theta(x), y) = 0$ if $h_\theta(x) = y$
$Cost(h_\theta(x), y) \rightarrow \infty$ if $y = 0$ and $h_\theta(x) \rightarrow 1$
$Cost(h_\theta(x), y) \rightarrow \infty$ if $y = 1$ and $h_\theta(x) \rightarrow 0$

Here the Cost function replaces the $\dfrac{1}{2}(h_\theta(x_i)-y_i)^2$ term of linear regression.

Substituting the Cost function into the whole cost function gives:

$$J(\theta) = -\dfrac{1}{m}\sum_{i=1}^m\big[y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)}))\big]$$

Vectorized implementation:

$h = g(X\theta)$
$J(\theta) = \dfrac{1}{m}\cdot\big(-y^T\log(h) - (1-y)^T\log(1-h)\big)$

The Gradient Descent iteration then becomes:

$$\theta_j := \theta_j - \dfrac{\alpha}{m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})\cdot x_j^{(i)}$$

Vectorized form:

$$\theta := \theta - \dfrac{\alpha}{m}X^T(g(X\theta) - \vec{y})$$

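A minimal C sketch of computing this cost on a toy dataset; the data and function names are illustrative only:

#include <math.h>
#include <stdio.h>

#define M 4   /* training examples */
#define N 2   /* parameters, including x0 = 1 */

static double sigmoid(double z) { return 1.0 / (1.0 + exp(-z)); }

/* J(theta) = -(1/m) * sum_i [ y_i*log(h_i) + (1-y_i)*log(1-h_i) ] */
static double logistic_cost(double theta[N], double x[M][N], double y[M])
{
    double J = 0.0;
    for (int i = 0; i < M; i++) {
        double z = 0.0;
        for (int j = 0; j < N; j++) z += theta[j] * x[i][j];
        double h = sigmoid(z);
        J += y[i] * log(h) + (1.0 - y[i]) * log(1.0 - h);
    }
    return -J / M;
}

int main(void)
{
    double x[M][N] = {{1, -2}, {1, -1}, {1, 1}, {1, 2}};
    double y[M] = {0, 0, 1, 1};
    double theta[N] = {0.0, 1.0};
    printf("J(theta) = %f\n", logistic_cost(theta, x, y));
    return 0;
}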
Overfitting过拟合

Regularized Linear Regression

$$J(\theta) = \dfrac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)})-y^{(i)})^2 + \lambda\sum_{j=1}^n\theta_j^2$$

The λ, or lambda, is the regularization parameter. It determines how much the costs of our theta parameters are inflated.

Regularized Logistic Regression

$$J(\theta) = -\dfrac{1}{m}\sum_{i=1}^m\big[y^{(i)}\log(h_\theta(x^{(i)})) + (1-y^{(i)})\log(1-h_\theta(x^{(i)}))\big] + \dfrac{\lambda}{2m}\sum_{j=1}^n\theta_j^2$$

The second sum, $\sum_{j=1}^n\theta_j^2$, means to explicitly exclude the bias term $\theta_0$.

Note: the regularized term does not include $\theta_0$; in Matlab/Octave, with 1-based indexing, that is theta(1).


Vim终极折腾 (The Ultimate Vim Tinkering)


本篇从这里开始
跟我一起学习VIM - The Life Changing Editor
然后到这里进入高潮
spf13/spf13-vim
第一篇看似是一篇普普通通介绍Vim插件的博客文。但是他在介绍一些顶级插件的同时,引入了spf13这个GitHub上的开源项目。这其实就是一个很全的vimrc配置。玩家可以根据自己的需求进行定制。这一下打开了潘多拉魔盒,本来怕麻烦折腾的我,一下就入坑了。

The configuration of each plugin is wrapped in folds, so it is all easy to scan, and the author also provides hooks for customization:

" Use fork vimrc if available {
if filereadable(expand("~/.vimrc.fork"))
source ~/.vimrc.fork
endif
" }

" Use local vimrc if available {
if filereadable(expand("~/.vimrc.local"))
source ~/.vimrc.local
endif
" }

" Use local gvimrc if available and gui is running {
if has('gui_running')
if filereadable(expand("~/.gvimrc.local"))
source ~/.gvimrc.local
endif
endif
" }
  • .vimrc.before - spf13-vim before configuration
  • .vimrc.before.fork - fork before configuration
  • .vimrc.before.local - before user configuration
  • .vimrc.bundles - spf13-vim bundle configuration
  • .vimrc.bundles.fork - fork bundle configuration
  • .vimrc.bundles.local - local user bundle configuration
  • .vimrc - spf13-vim vim configuration
  • .vimrc.fork - fork vim configuration
  • .vimrc.local - local user configuration

低端玩家可以用.vimrc.*.local来做本地化定制,高端玩家可以fork repo,然后添加自己的.vimrc.*.fork做定制,然后commit,然后所有你的Vim环境就都可以用到这一份配置了。
完工后的Vim看起来这样

问题

powerline fonts

Windows

To get this effect, the trickiest part is the powerline font: if the font is not right, the little arrow-shaped separators cannot be rendered. The plugin that draws this style is vim-airline, which depends on the powerline fonts. On Windows this is fairly easy to solve, because spf13 already installs powerline-fonts for us:

" .vimrc.bundles
Bundle 'powerline/fonts'

所有的字体文件都被安装到$HOME\.vim\bundle\fonts\。然后我们再调用其中的power script脚本install.ps1,就会安装到系统字体目录。然后在Vim中选择对应的字体就好了。

Linux

在Linux,尤其是通过putty访问Linux的时候就没这么简单了。因为我全是用putty ssh访问Linux,所以一开始以为是putty的问题,网上搜出来一堆,都是这样的解释。

  1. Download the patched fonts. I chose DejaVuSansMono as my font since I like it most.
  2. Install this font in Windows to make it accessible for all programs.
  3. Open PuTTY and make changes to the settings:
  4. Under appearance select the patched font
  5. Select font quality Clear Type
  6. Under Translation select character set UTF-8
  7. Apply settings and restart the PuTTY session

但按照这样做完,却并不起效果。后来发现在Linux本机的图形界面下用Terminal开Vim也是一样的情况。
在终端的profile中配置字体还是一样,没有效果。
然后找到Powerline的官方文档——Powerline

  1. Move the patched font to a valid X font path. Valid font paths can be listed with xset q:

    mv 'SomeFont for Powerline.otf' ~/.fonts/

  2. Update font cache for the path the font was moved to (root priveleges may be needed for updating font cache for some paths):

    fc-cache -vf ~/.fonts/

颇受启发。
After some analysis, the conclusions were:

  1. Linux desktop environments differ (GNOME, KDE, and so on), and different desktops use different font directories. The font directory we are currently using is
[benzhou@plslx111 phx]$ xset q|grep font
catalogue:/etc/X11/fontpath.d,built-ins
  2. Different desktop environments also come with different terminals; what we use is Konsole under KDE.
  3. Modify install.sh so the fonts are installed into the matching font directory, then refresh the font cache.
  4. Configure Konsole.
  5. After that, everything works perfectly under putty.

Update - 20170606
Linux上的字体安装始终不成功,在Terminal里面无法找到powerline字体。
最后的解决办法是,字体安装目录在/usr/share/fonts。把~/.vim/bundles/fonts/install.sh中的font_dir设置为对应的路径,再运行就OK了

字体问题

Windows下使用Courier New字体感觉怪怪的,原因是设置了let g:spf13_no_big_font,但因为后面采用了Source Code Pro for Powerline字体,所以就还是保持1。

neocomplete不支持

装完spf13并没有深究每一个插件的用法,偶尔看到对neocomplete的推荐,所以想试一试。在命令行中输入:NeoCompleteEnable,发现竟然不支持,原来neocomplete需要Vim对lua的支持。有两条路可以走:

  1. 安装vim-nox/vim-athena/vim-gtk/vim-gnome其中之一
  2. 自行编译Vim,并使能lua支持
    因为我们使用的发行版过于陈旧,上面提到的四个包都没有,所以只能走第二条路了。
    命令只有两条:
[benzhou@plslx111 vim]$ ./configure  --enable-rubyinterp \
--enable-pythoninterp \
--with-python-config-dir=/usr/bin/python2.6-config \ # 这一行python的配置也要对
--enable-perlinterp \
--enable-gui=gtk2 \
--enable-cscope \
--prefix=/usr/local \
--enable-luainterp \ # 下面两行最重要
--with-lua-prefix=/usr/bn #
[benzhou@plslx111 vim]$ make

编译Vim,在configure的时候,发现找不到lua的头文件,搜索一番,原因是没有安装lua-devel的包,于是在网上找了一个Fedora的RPM包,运行rpm -i安装,再configure就成了。
打开Vim命令行,运行echo has("lua"),结果是1,就大功告成了。

Error 523

在写Vim脚本的时候,一个很简单的函数出现了E523: Not allowed here。最后查明原因是在被map的命令中不能调用execute和normal函数,据说是为了安全。详情查看:help e523。下面是错误的.vimrc代码

function CscopeFind()
execute "cscope add cscope.out"
endfunction

nmap <space>s :<C-R>=CscopeFind()<CR>cscope find s <C-R>=expand('<cword>')<CR>

if 判断字符串

这是一个坑,和Python等主流语言不同,Vim脚本里,if后面的字符串会被强制转换成bool变量,再进行判断。而字符串对应的bool值为0。所以下面的echo语句永远也执行不到。

if (glob('/existing/file'))
echo ('yes')
endif

正确的写法应当是

if (!empty(glob('...'))
...
endif

插件

ctrlP

有了这个插件,cscope find f就没啥大用了。看官网的基本用法:

  • Press <F5> to purge the cache for the current directory to get new files, remove deleted files and apply new ignore options.
  • Press <c-f> and <c-b> to cycle between modes. (Ben: 在搜索文件,buffer, funky之间循环跳转)
  • Press <c-d> to switch to filename only search instead of full path. (Ben: 这个对于我比较有用,因为我们的路径太长)
  • Press <c-r> to switch to regexp mode.
  • Use <c-j>, <c-k> or the arrow keys to navigate the result list.
  • Use <c-t> or <c-v>, <c-x> to open the selected entry in a new tab or in a new split.
  • Use <c-n>, <c-p> to select the next/previous string in the prompt’s history.
  • Use <c-y> to create a new file and its parent directories.
  • Use <c-z> to mark/unmark multiple files and <c-o> to open them.

目前我的配置

  • <leader>-f: 模糊搜索最近打开的文件(MRU)
  • <leader>-p: 模糊搜索当前目录及其子目录下的所有文件
  • <leader>fu: 进入当前文件的函数列表搜索
  • <leader>fU: 搜索当前光标下单词对应的函数

neocomplete

omni-complete

这是在Vim 7.0引入的自动补全功能,不依赖于任何插件。在没有neocomplete的时候,比如Windows下装neocomplete和ycm都很麻烦,就只能用原生的补全。其实也够用了,只是没有那么方便。
所有帮助命令入口:

:help ins-completion
:help compl-omni
:help ‘omnifunc’
:help i_CTRL-X_CTRL-O
:help ins-completion-menu
:help popupmenu-keys
:help ‘completeopt’
:help compl-omni-filetypes
:help omnicppcomplete.txt

看spf13的.vimrc中有这么一句:Automatically open and close the popup menu / preview window
I originally assumed omni-complete would pop up the completion menu automatically like neocomplete, but after some Googling and reading the help, that understanding turned out to be wrong.
Nothing opens automatically; you have to trigger it with <C-X><C-O>, <C-N> or <C-P>. In fact <C-X> enters a sub-mode of insert mode, and the key combination that follows selects the kind of completion, summarized below:

  • <C-N>/<C-P> 这就不提了,最基本的功能,N=next,P=previous
  • <CTRL-K> 字典补全,查询dictionary参数中对应的文件,进行补全
  • <CTRL-I> 当前文件和包含文件补全,I=include
  • <CTRL-]> tag补全
  • <CTRL-F> 文件名补全
  • <CTRL-D> 宏补全
  • <CTRL-V> Vim命令补全,V=Vim
  • <CTRL-U> User defined补全,U=User,补全函数通过set completefunc=xxx传入
  • <CTRL-O> omni补全,O=Omni,补全函数通过set omnifunc=xxx传入,这个Vim自带的脚本都已写好,spf13的vimrc也有参考设置了

UltiSnips

在折腾半天snipMate无果的情况下,转投UltiSnips。语法复杂度以及灵活性要强于snipMate。注意以下几点即可

  • UltiSnips搜索所有runtimepath下的UltiSnips目录中的.snippet结尾的文件
  • UltiSnips通过一个全局变量决定是否支持snipMate
  • UltiSnips用<tab>补全,与omni-complete冲突,不过omni-complete还有一种用法是<C-N>/<C-P>来切换选项,也凑合能用了

C面向对象 (Object-Oriented Programming in C)


曾经谈到面相对象,就是C++,之后最多加上Java。后来,读过的书上说,面向对象其实只是一种设计方式——万事即对象,而不在乎用的是什么语言。C语言,一样也可以做面向对象设计,有时候还更自由,没有C++的各种隐藏的坑。这篇笔记就具体说说如何用C来做面向对象。
开始之前,用一篇网上收录的笔记作为入门:用C语言写面向的对象是一种什么样的体验

封装

对类结构的封装,C里面用struct,相比C++缺少的就是:

  • 构造函数和析构函数
  • private & protect
/** Class definition of BaseLed.
* The concrete LED class should inherit from this class.
*/
typedef struct BaseLed
{
uint8_t id;
/**
* Below 4 members are used to record the LED blinking state.
*/
LedState state;
uint16_t onTime;
uint16_t offTime;
uint16_t blinkCount;
/**
* Public function members.
*/
bool (*init)(struct BaseLed*, uint8_t); //!< Init hardware if needed
void (*switchit)(struct BaseLed*, LedState); //!< Turn on/off
void (*set_brightness)(struct BaseLed*, uint8_t); //!< Set LED's brightness level
}BaseLed;

上面的例子,上半部分是成员变量,下半部分是成员函数。

构造函数&析构函数

C++中,构造函数在以下情况中调用:

  • 比如你在main里面声明了一个类A..那么~A()会在main结束时调用
  • 如果在自定义的函数f()里面声明了一个A 函数f结束的时候就会调用~A()
  • 或者你delete 指向A的指针..
  • 或者显式的调用析构函数

C++各种隐藏的坑,参考网页那些被C++默默地声明和调用的函数

在C面向对象里面,就没有这种问题,因为C语言根本不会为你生成隐藏的函数,也不会帮你调用构造函数,分配内存之类的,也没有虚函数表(VTable)。所有这一切都得自己亲力亲为。不过这样也好,所见即所得嘛。

// ClassA.h
typedef struct _ClassA
{
    int a;
    void (*funcA)(struct _ClassA*, int);   // "member function" pointer
}ClassA;
ClassA* newA(int);       // constructor
void deleteA(ClassA*);   // destructor

// ClassA.c
#include <stdlib.h>      // malloc/free
#include <ClassA.h>
void funcA(ClassA* thisObj, int a)
{
}
ClassA* newA(int a)
{
    ClassA* Aobj = malloc(sizeof(ClassA));
    Aobj->funcA = funcA;
    Aobj->a = a;
    return Aobj;         // the constructor must return the new object
}
void deleteA(ClassA* obj)
{
    free(obj);
}

// main.c
#include <ClassA.h>
void main()
{
    ClassA* aobj = newA(0);
}

在头文件中定义结构体,在.c文件中定义成员函数,在需要调用的地方include对应的头文件。注意,当main函数和ClassA定义在不同的lib中时,头文件中的newA, deleteA需要定义为extern。

访问控制——private & protect

上面的例子是最基本的封装定义。C++中最有趣的是访问控制,即隐藏具体实现,而只暴露接口。

private

怎么做到private呢?用结构体把private包起来。

// ClassA.c
#include <stdlib.h>
#include <ClassA_Private.h>
#include <ClassA.h>
ClassA* newA(int a)
{
    ClassA* pA = malloc(sizeof(ClassA));
    pA->private_data = malloc(sizeof(ClassA_Private));
    ClassA_Private* pPrivate = (ClassA_Private*)(pA->private_data);
    pPrivate->private_a = a;
    return pA;
}
...

// ClassA.h
typedef struct _ClassA
{
    int a;
    void (*funcA)(struct _ClassA*, int);
    void* private_data;   // void pointer hides all details
}ClassA;

// ClassA_Private.h
struct _ClassA;
typedef struct _ClassA_Private
{
    int private_a;
    void (*private_funcA)(struct _ClassA*, int);
}ClassA_Private;

// main.c
#include <ClassA.h>
void main()
{
    ClassA* obj = newA(0);
    // The line below would be a compile error: private_data is only a void*
    // and ClassA_Private.h is not exposed to this file.
    // obj->private_data.private_funcA
}

将不想暴露的数据放在一个结构体中,类定义的头文件包含该头文件。对于想隐藏的客户代码,不暴露该头文件,则其不能引用该private代码。

protect

protect数据是只能暴露给子类和友元,关于友元,参考【C++基础之十】友元函数和友元类
在C里面,就没这么多花头了,“对于想隐藏的客户代码,不暴露该头文件”。

// ChildClassA.h
#include <BaseClassA.h>
#include <BaseClassA_Private.h>   // a subclass is given the "protected" header
typedef struct _ChildClassA
{
    BaseClassA parent;
    ...
}ChildClassA;

// FriendClassA.h
#include <BaseClassA.h>
#include <BaseClassA_Private.h>   // a "friend" is simply given the same header
typedef struct _FriendClassA
{
    BaseClassA parent;
    ...
}FriendClassA;

// main.c  (ordinary client code: only the public header, no access to the rest)
#include <BaseClassA.h>
继承

上面已经看到继承的基本代码了,但是为了实现标准的继承我们还要做的更多。

// BaseClassA.h
typedef struct BaseClassA
{
    int a;
    void (*funcA)(struct BaseClassA*, int);
}BaseClassA;

// DerivedClassB
#include <stdlib.h>
#include <BaseClassA.h>
typedef struct DerivedClassB
{
    BaseClassA parent;                           // parent instance comes first
    int special;
    void (*funcA)(struct DerivedClassB*, int);   // overrides the parent's funcA
    void (*funcB)(struct DerivedClassB*, int);
}DerivedClassB;
// funcA below refers to DerivedClassB's own implementation (definition omitted)
DerivedClassB* NewB()
{
    DerivedClassB* newB = malloc(sizeof(DerivedClassB));
    newB->parent.funcA = funcA;   // fill the parent's slot with the derived implementation
    newB->funcA = funcA;
    return newB;
}

子类包含父类的实例,这样就实现了继承。子类同名函数重写了父类同名函数。为下面多态做好准备。

多态

接着上面的例子

BaseClassA* pBase = (BaseClassA*)NewB();   // safe: parent is the first member of DerivedClassB
pBase->funcA(pBase, 0);   // calls the function assigned in the subclass constructor: that is polymorphism

Example

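The original example that belonged here did not survive the export. Below is a minimal self-contained sketch of my own that combines the encapsulation, inheritance and polymorphism pieces above; the Shape/Circle names are mine, not from the original post:

#include <stdio.h>
#include <stdlib.h>

/* "Base class": data plus function pointers, as in the sections above. */
typedef struct Shape
{
    const char* name;
    double (*area)(struct Shape*);   /* "virtual" method slot */
}Shape;

/* "Derived class": embeds the parent as its first member. */
typedef struct Circle
{
    Shape parent;
    double radius;
}Circle;

static double circle_area(Shape* self)
{
    /* Safe downcast: parent is the first member, so the addresses coincide. */
    Circle* c = (Circle*)self;
    return 3.14159265 * c->radius * c->radius;
}

/* Constructor: allocate, wire up the method slot, initialize the data. */
static Circle* newCircle(double radius)
{
    Circle* c = malloc(sizeof(Circle));
    c->parent.name = "circle";
    c->parent.area = circle_area;
    c->radius = radius;
    return c;
}

/* Destructor. */
static void deleteCircle(Circle* c)
{
    free(c);
}

int main(void)
{
    Circle* c = newCircle(2.0);
    Shape* s = (Shape*)c;                           /* treat it as the base class */
    printf("%s area = %f\n", s->name, s->area(s));  /* polymorphic call */
    deleteCircle(c);
    return 0;
}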
%23C%u9762%u5411%u5BF9%u8C61%0A@%28myblog%29%5Bc/c++%2C%20%u9762%u5411%u5BF9%u8C61%5D%0A%u66FE%u7ECF%u8C08%u5230%u9762%u76F8%u5BF9%u8C61%uFF0C%u5C31%u662FC++%uFF0C%u4E4B%u540E%u6700%u591A%u52A0%u4E0AJava%u3002%u540E%u6765%uFF0C%u8BFB%u8FC7%u7684%u4E66%u4E0A%u8BF4%uFF0C%u9762%u5411%u5BF9%u8C61%u5176%u5B9E%u53EA%u662F%u4E00%u79CD%u8BBE%u8BA1%u65B9%u5F0F%u2014%u2014%u4E07%u4E8B%u5373%u5BF9%u8C61%uFF0C%u800C%u4E0D%u5728%u4E4E%u7528%u7684%u662F%u4EC0%u4E48%u8BED%u8A00%u3002C%u8BED%u8A00%uFF0C%u4E00%u6837%u4E5F%u53EF%u4EE5%u505A%u9762%u5411%u5BF9%u8C61%u8BBE%u8BA1%uFF0C%u6709%u65F6%u5019%u8FD8%u66F4%u81EA%u7531%uFF0C%u6CA1%u6709C++%u7684%u5404%u79CD%u9690%u85CF%u7684%u5751%u3002%u8FD9%u7BC7%u7B14%u8BB0%u5C31%u5177%u4F53%u8BF4%u8BF4%u5982%u4F55%u7528C%u6765%u505A%u9762%u5411%u5BF9%u8C61%u3002%0A%u5F00%u59CB%u4E4B%u524D%uFF0C%u7528%u4E00%u7BC7%u7F51%u4E0A%u6536%u5F55%u7684%u7B14%u8BB0%u4F5C%u4E3A%u5165%u95E8%uFF1A%5B%u7528C%u8BED%u8A00%u5199%u9762%u5411%u7684%u5BF9%u8C61%u662F%u4E00%u79CD%u4EC0%u4E48%u6837%u7684%u4F53%u9A8C%5D%28https%3A//app.yinxiang.com/shard/s10/nl/161681/29001a8f-6996-413e-b6ad-41c3cbba9c00%29%u3002%0A%0A%5BTOC%5D%0A%0A%23%23%u5C01%u88C5%0A%u5BF9%u7C7B%u7ED3%u6784%u7684%u5C01%u88C5%uFF0CC%u91CC%u9762%u7528%60struct%60%2C%u76F8%u6BD4C++%u7F3A%u5C11%u7684%u5C31%u662F%uFF1A%0A-%20%u6784%u9020%u51FD%u6570%u548C%u6790%u6784%u51FD%u6570%0A-%20private%20%26%20protect%0A%60%60%60%0A/**%20Class%20definition%20of%20BaseLed.%0A%20*%20%20The%20concrete%20LED%20class%20should%20inherit%20from%20this%20class.%0A%20*/%0Atypedef%20struct%20BaseLed%0A%7B%0A%20%20%20%20uint8_t%20id%3B%0A%20%20%20%20/**%0A%20%20%20%20%20*%20Below%204%20members%20are%20used%20to%20record%20the%20LED%20blinking%20state.%0A%20%20%20%20%20*/%0A%20%20%20%20LedState%20state%3B%20%20%20%20%20%20%20%20%20%0A%20%20%20%20uint16_t%20onTime%3B%0A%20%20%20%20uint16_t%20offTime%3B%0A%20%20%20%20uint16_t%20blinkCount%3B%0A%0A%20%20%20%20/**%0A%20%20%20%20%20*%20Public%20function%20members.%0A%20%20%20%20%20*/%0A%20%20%20%20bool%20%28*init%29%28struct%20BaseLed*%2C%20uint8_t%29%3B%20%20%20%20%20%20%20%20%20%20%20%20%20//%21%3C%20Init%20hardware%20if%20needed%0A%20%20%20%20void%20%28*switchit%29%28struct%20BaseLed*%2C%20LedState%29%3B%20%20%20%20%20%20%20%20//%21%3C%20Turn%20on/off%0A%20%20%20%20void%20%28*set_brightness%29%28struct%20BaseLed*%2C%20uint8_t%29%3B%20%20%20//%21%3C%20Set%20LED%27s%20brightness%20level%0A%7DBaseLed%3B%0A%60%60%60%0A%u4E0A%u9762%u7684%u4F8B%u5B50%uFF0C%u4E0A%u534A%u90E8%u5206%u662F%u6210%u5458%u53D8%u91CF%uFF0C%u4E0B%u534A%u90E8%u5206%u662F%u6210%u5458%u51FD%u6570%u3002%0A%23%23%23%u6784%u9020%u51FD%u6570%26%u6790%u6784%u51FD%u6570%0AC++%u4E2D%uFF0C%u6784%u9020%u51FD%u6570%u5728%u4EE5%u4E0B%u60C5%u51B5%u4E2D%u8C03%u7528%uFF1A%0A%3E*%20%u6BD4%u5982%u4F60%u5728main%u91CC%u9762%u58F0%u660E%u4E86%u4E00%u4E2A%u7C7BA..%u90A3%u4E48%7EA%28%29%u4F1A%u5728main%u7ED3%u675F%u65F6%u8C03%u7528%0A%3E*%20%u5982%u679C%u5728%u81EA%u5B9A%u4E49%u7684%u51FD%u6570f%28%29%u91CC%u9762%u58F0%u660E%u4E86%u4E00%u4E2AA%20%20%u51FD%u6570f%u7ED3%u675F%u7684%u65F6%u5019%u5C31%u4F1A%u8C03%u7528%7EA%28%29%0A%3E*%20%u6216%u8005%u4F60delete%20%u6307%u5411A%u7684%u6307%u9488..%0A%3E*%20%u6216%u8005%u663E%u5F0F%u7684%u8C03%u7528%u6790%u6784%u51FD%u6570%0A%0AC++%u5404%u79CD%u9690%u85CF%u7684%u5751%uFF0C%u53C2%u8003%u7F51%u9875%5B%u90A3%u4E9B%u88ABC++%u9ED8%u9ED8%u5730%u58F0%u660E%u548C%u8C03%u7528%u7684%u51FD%u6570%5D%28https%3A//app.yinxiang.com/shard/s10/nl/161681/86f265a4-1139-468b-a941-1897a54ae440%29%0A%0A%u5728C%u9762%u5411%u5BF9%u8
### Constructors & destructors

In C++, the destructor is invoked in the following situations:
> * If you declare an object of class A in `main`, then `~A()` is called when `main` ends.
> * If you declare an A inside your own function `f()`, `~A()` is called when `f` returns.
> * When you `delete` a pointer to an A.
> * When you call the destructor explicitly.

For the various functions that C++ silently declares and calls for you, see [那些被C++默默地声明和调用的函数](https://app.yinxiang.com/shard/s10/nl/161681/86f265a4-1139-468b-a941-1897a54ae440).

With object-oriented C there is no such issue, because C never generates hidden functions for you, never calls constructors or allocates memory on your behalf, and has no virtual function table (VTable). All of that you have to do yourself. Which is not a bad thing: what you see is what you get.

```
// ClassA.h
typedef struct _ClassA
{
	int a;
	void (*funcA)(struct _ClassA*, int);
} ClassA;

ClassA* newA(int);
void deleteA(ClassA*);

// ClassA.c
#include <stdlib.h>
#include "ClassA.h"

void funcA(ClassA* thisObj, int a)
{
}

ClassA* newA(int a)
{
	ClassA* Aobj = malloc(sizeof(ClassA));
	Aobj->funcA = funcA;
	Aobj->a = a;
	return Aobj;
}

void deleteA(ClassA* obj)
{
	free(obj);
}

// main.c
#include "ClassA.h"
int main(void)
{
	ClassA* aobj = newA(0);
	deleteA(aobj);
	return 0;
}
```

Define the struct in the header, define the member functions in the .c file, and include the corresponding header wherever the class is used. Note that when the `main` function and ClassA live in different libraries, newA and deleteA in the header need to be declared extern.

### Access control: private & protected

The code above is the most basic form of encapsulation. The most interesting part of C++ is access control, that is, hiding the implementation and exposing only the interface.

#### private

How do we get private? Wrap the private parts in their own struct.

```cpp
// ClassA.c
#include <stdlib.h>
#include "ClassA_Private.h"
#include "ClassA.h"

ClassA* newA(int a)
{
	ClassA* pA = malloc(sizeof(ClassA));
	pA->private_data = malloc(sizeof(ClassA_Private));
	ClassA_Private* pPrivate = (ClassA_Private*)(pA->private_data);
	pPrivate->private_a = a;
	return pA;
}
...
```
```
// ClassA.h
typedef struct _ClassA
{
	int a;
	void (*funcA)(struct _ClassA*, int);
	void* private_data;  // void pointer hides all the details
} ClassA;
```
```
// ClassA_Private.h
struct _ClassA;
typedef struct _ClassA_Private
{
	int private_a;
	void (*private_funcA)(struct _ClassA*, int);
} ClassA_Private;
```

```
// main.c
#include "ClassA.h"
int main(void)
{
	ClassA* obj = newA(0);
	// The line below would not compile: main.c cannot see the layout
	// behind the void* private_data.
	// obj->private_data.private_funcA
	return 0;
}
```

Put the data you do not want to expose in its own struct, and include that private header only from the class's implementation file. For client code you want it hidden from, simply do not expose that header, so it cannot reference the private parts.
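Because `newA` now allocates two blocks, the matching destructor has to release both. The original note does not spell this out; a minimal sketch of what `deleteA` could look like under this opaque-pointer scheme:

```c
// ClassA.c (continued) -- hypothetical companion to the newA above
void deleteA(ClassA* obj)
{
	if (obj == NULL)
		return;
	free(obj->private_data);   // release the hidden ClassA_Private block first
	free(obj);                 // then the public object itself
}
```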
#### protected

Protected data is exposed only to subclasses and friends; on friends, see [【C++基础之十】友元函数和友元类](https://app.yinxiang.com/shard/s10/nl/161681/b14c761f-a0d8-4c31-b54c-fb0f4a369aae).

In C there are no such frills. The rule is the same as before: "for client code you want it hidden from, do not expose that header". Subclasses and "friend" classes include the private header; ordinary clients include only the public one.

```
// ChildClassA.h
#include "BaseClassA.h"
#include "BaseClassA_Private.h"
typedef struct _ChildClassA
{
	BaseClassA parent;
	...
} ChildClassA;

// FriendClassA.h
#include "BaseClassA.h"
#include "BaseClassA_Private.h"
typedef struct _FriendClassA
{
	BaseClassA parent;
	...
} FriendClassA;

// main.c
#include "BaseClassA.h"
```

## Inheritance

The basic inheritance code has already appeared above, but a standard implementation of inheritance takes a little more work.

```
// BaseClassA.h
typedef struct BaseClassA
{
	int a;
	void (*funcA)(struct BaseClassA*, int);
} BaseClassA;

// DerivedClassB.h
#include "BaseClassA.h"
typedef struct DerivedClassB
{
	BaseClassA parent;   // parent instance comes first
	int special;
	void (*funcA)(struct DerivedClassB*, int);
	void (*funcB)(struct DerivedClassB*, int);
} DerivedClassB;

DerivedClassB* NewB()
{
	// funcA here is the derived class's implementation (defined in DerivedClassB.c);
	// it is installed both in the parent's slot, overriding it, and in the derived slot.
	DerivedClassB* newB = malloc(sizeof(DerivedClassB));
	newB->parent.funcA = funcA;
	newB->funcA = funcA;
	return newB;
}
```

The subclass contains an instance of the parent class, and that is what realizes inheritance. The subclass's function of the same name overrides the parent's, which sets the stage for polymorphism below.

## Polymorphism

Continuing the example above:

```
BaseClassA* pBase = (BaseClassA*)NewB();
pBase->funcA(pBase, 0); // this calls the function assigned in the subclass's constructor, which is polymorphism
```

## Example
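Tying the pieces together, here is a minimal, self-contained sketch of the whole pattern: a constructor that wires up the method table (encapsulation), a derived struct that embeds its parent as the first member (inheritance), and a call made purely through the base interface (polymorphism). The Shape and Circle names and the area method are illustrative choices, not taken from the note above.

```c
/* oop_demo.c -- illustrative sketch; Shape/Circle/area are not from the original note. */
#include <stdio.h>
#include <stdlib.h>

/* ---- "Base class" ---- */
typedef struct Shape
{
    double (*area)(struct Shape*);      /* virtual-ish method slot */
    void   (*destroy)(struct Shape*);   /* virtual destructor slot */
} Shape;

/* ---- "Derived class": the parent instance is the first member ---- */
typedef struct Circle
{
    Shape  parent;   /* first member, so Circle* can be viewed as Shape* */
    double radius;
} Circle;

static double circle_area(Shape* self)
{
    Circle* c = (Circle*)self;          /* safe: parent is the first member */
    return 3.14159265358979 * c->radius * c->radius;
}

static void circle_destroy(Shape* self)
{
    free(self);
}

/* "Constructor": allocates the object and wires up the method slots. */
Shape* newCircle(double radius)
{
    Circle* c = malloc(sizeof(Circle));
    if (c == NULL)
        return NULL;
    c->parent.area    = circle_area;
    c->parent.destroy = circle_destroy;
    c->radius         = radius;
    return (Shape*)c;                   /* hand back the base-class view */
}

int main(void)
{
    Shape* s = newCircle(2.0);
    if (s == NULL)
        return 1;
    /* Polymorphic call: main only knows Shape, but the Circle code runs. */
    printf("area = %f\n", s->area(s));
    s->destroy(s);                      /* polymorphic destruction, too */
    return 0;
}
```

Because the parent instance is the first member, a Circle* and a Shape* refer to the same address, which is what makes the cast inside circle_area legal.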



# Python win32com

Python is remarkably powerful: through its Win32 bindings it can drive any Windows application. For example, when building our Dashboard generator we used win32com to drive Excel and PowerPoint, pulling data out of Excel and updating the PPT file. Early on we used the Python packages xlrd and python-pptx for Excel and PPT, but their functionality turned out to be very limited, and later in development I found that they are not only weaker but also far slower than win32com. Their only advantage is that they are cross-platform and do not depend on Windows, which held no appeal for me here.

There is not much documentation on win32com online. The following article, on driving PowerPoint, is a decent introduction:

- [[Python] win32com PowerPoint « ibluemonkey's Note](https://app.yinxiang.com/shard/s10/nl/161681/d5ac6ed2-19a4-435d-9b98-69db9ae0107c)

For everything else, refer to Microsoft's VBA documentation; every COM object has its own list of properties and methods:

- [对象模型（PowerPoint VBA 参考）](https://msdn.microsoft.com/zh-cn/library/office/ff743835.aspx)

## Some lessons learned

1. Most COM indexes start at 1, not 0.
2. Excel is organized as Workbook -> Worksheet -> ListObjects / ChartObjects.

Driving PPT and Excel really deserves a post of its own, but I have not had the time. Excel has a few core concepts that I only worked out gradually by poking around in pdb: Range, AutoFilter, ListObject. I am listing them here and will fill in the details when I find the time.
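To make that Workbook -> Worksheet -> ListObject hierarchy concrete, here is a minimal sketch of reading an Excel workbook through win32com; the file path and the fields printed are illustrative assumptions, not code from the Dashboard project.

```python
# Sketch only: the workbook path below is hypothetical.
import win32com.client

excel = win32com.client.Dispatch("Excel.Application")
excel.Visible = False

wb = excel.Workbooks.Open(r"C:\data\dashboard.xlsx")   # Workbook
try:
    ws = wb.Worksheets(1)             # Worksheet; COM indexes start at 1
    print(ws.Name)

    # Read a single cell; a multi-cell Range comes back as a tuple of tuples.
    print(ws.Range("A1").Value)

    # ListObjects are Excel "tables"; ChartObjects hold the embedded charts.
    for i in range(1, ws.ListObjects.Count + 1):
        table = ws.ListObjects(i)
        print(table.Name, table.Range.Rows.Count)
finally:
    wb.Close(SaveChanges=False)
    excel.Quit()
```

The same Dispatch pattern works for PowerPoint with "PowerPoint.Application".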
