Given a string, find the length of the longest substring without repeating characters.
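One common way to approach this problem is a sliding window that remembers where each character was last seen. The sketch below is only an illustration under that assumption; the function name `length_of_longest_substring` is hypothetical and not taken from the original post.

```python
def length_of_longest_substring(s: str) -> int:
    """Return the length of the longest substring of s with no repeated character."""
    last_seen = {}  # character -> index of its most recent occurrence
    start = 0       # left edge of the current repeat-free window
    best = 0
    for i, ch in enumerate(s):
        # If ch already occurs inside the current window, move the left edge
        # just past that earlier occurrence.
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        best = max(best, i - start + 1)
    return best


if __name__ == "__main__":
    print(length_of_longest_substring("abcabcbb"))  # 3, e.g. "abc"
```

Each character is visited once, so the running time is linear in the length of the string.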
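Elasticsearch: The Definitive Guide, Getting Started Note (based on 7.x)
Contents
Elasticsearch is built on Lucene, and it is more than just a full-text search engine:
- A distributed real-time document store in which every field can be indexed and searched
- A distributed real-time analytics search engine
- Capable of scaling to hundreds of server nodes and handling petabytes of structured or unstructured data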
Install Elasticsearch with Docker | Elasticsearch Reference [7.2] | Elastic
- docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.2.0
- curl http://127.0.0.1:9200/_cat/health
- curl 'http://localhost:9200/?pretty'
Running Kibana on Docker | Kibana User Guide [7.2] | Elastic
- docker run --link YOUR_ELASTICSEARCH_CONTAINER_NAME_OR_ID:elasticsearch -p 5601:5601 {docker-repo}:{version}
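Storing data in Elasticsearch is called indexing, but before indexing a document we need to decide where to store it.
An Elasticsearch cluster can contain multiple indices, and each index can in turn contain multiple types. These types hold multiple documents, and each document has multiple fields. (Note that in 7.x mapping types are deprecated: an index effectively holds a single type, `_doc`.)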
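To make "deciding where a document is stored" concrete, here is a minimal sketch that builds (but does not send) an indexing request for a 7.x node, using the `PUT /<index>/_doc/<id>` form of the document API. The index name `megacorp`, the document body, and the helper name `build_index_request` are illustrative assumptions, not something the note prescribes.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_id, document):
    """Build a PUT request that would index `document` at /<index>/_doc/<doc_id>."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical index name and document, for illustration only.
req = build_index_request(
    "http://localhost:9200", "megacorp", 1, {"first_name": "John", "age": 25}
)
print(req.get_method(), req.full_url)

# With a node running (for example the Docker container started earlier),
# the request could be sent with: urllib.request.urlopen(req)
```

The index and the document ID together form the address of the document; the request body carries its fields.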
Syntactic Parsing Note (SLP Ch13)
This chapter focuses on the structures assigned by context-free grammars. Context-free grammars don’t specify how the parse tree for a given sentence should be computed. We therefore need to specify algorithms that employ these grammars to efficiently produce correct trees. Parse trees are useful in applications such as grammar checking, semantic analysis, question answering and information extraction.
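As an illustration of an algorithm that applies a grammar to a sentence, the following is a minimal CKY-style recognizer. It assumes a toy grammar already in Chomsky normal form; the rules, the lexicon, and the name `cky_recognize` are all made up for this sketch, and it only answers whether a sentence is derivable rather than building the tree.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Binary rules map a pair of right-hand-side symbols to possible left-hand sides.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules map a word to its possible preterminals.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}


def cky_recognize(words, start_symbol="S"):
    """Return True if the toy grammar derives `words` from `start_symbol`."""
    n = len(words)
    if n == 0:
        return False
    # table[(i, j)] holds every nonterminal that derives words[i:j].
    table = defaultdict(set)
    for i, word in enumerate(words):
        table[(i, i + 1)] |= LEXICON.get(word, set())
    # Fill in longer spans from shorter ones, trying every split point k.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left in table[(i, k)]:
                    for right in table[(k, j)]:
                        table[(i, j)] |= BINARY_RULES.get((left, right), set())
    return start_symbol in table[(0, n)]


if __name__ == "__main__":
    print(cky_recognize("the dog chased the cat".split()))  # True
    print(cky_recognize("dog the chased".split()))          # False
```

The table reuses results for shorter spans, which is what keeps the search polynomial instead of enumerating every possible tree.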
Formal Grammars of English Note (SLP Ch12)
This chapter is devoted to the topic of context-free grammars. They are integral to many computational applications, including grammar checking, semantic interpretation, dialogue understanding, and machine translation.
Information Entropy and Choice: Thoughts Prompted by the Monty Hall Problem
Sequence Processing with Recurrent Networks Note (SLP Ch09)
Problems with the sliding window in general neural networks:
- Like the Markov assumption, it limits the context from which information can be extracted to the window itself
- A fixed window makes it difficult to learn systematic patterns arising from phenomena like constituency
RNN is a class of networks designed to address these problems by processing sequences explicitly as sequences, allowing us to handle variable length inputs without the use of arbitrary fixed-sized windows.
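A minimal sketch of this idea, assuming a simple (Elman-style) recurrence h_t = tanh(W x_t + U h_{t-1}) with no bias term: the same weights are applied at every step, so sequences of any length can be processed. The sizes, the random weights, and the helper names are illustrative assumptions.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_forward(inputs, W, U, hidden_size):
    """Run the recurrence over `inputs` and return the hidden state at each step."""
    h = [0.0] * hidden_size  # initial hidden state
    states = []
    for x in inputs:
        wx = matvec(W, x)  # contribution of the current input
        uh = matvec(U, h)  # contribution of the previous hidden state
        h = [math.tanh(a + b) for a, b in zip(wx, uh)]
        states.append(h)
    return states


rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 3, 4
W = make_matrix(HIDDEN_SIZE, INPUT_SIZE, rng)
U = make_matrix(HIDDEN_SIZE, HIDDEN_SIZE, rng)

short_seq = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
long_seq = short_seq + [[0.0, 0.0, 1.0]] * 3

print(len(rnn_forward(short_seq, W, U, HIDDEN_SIZE)))  # 2
print(len(rnn_forward(long_seq, W, U, HIDDEN_SIZE)))   # 5
```

No window size appears anywhere: the hidden state carries information forward from all earlier inputs.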
Common Database Operations and Resources
Part-of-Speech Tagging Note (SLP Ch08)
Parts-of-speech (also known as POS, word classes, or syntactic categories) are useful because they reveal a lot about a word and its neighbors. Useful for:
- labeling named entities
- coreference resolution
- speech recognition or synthesis
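Some Views on Work (from 《华为工作法》, "The Huawei Way of Working")
During a trip to Wuyi Mountain I spent three hours reading this book. Although it is mainly about mindset, and the boundaries between some of its points are blurry or even in conflict, it suits my taste. I have always wanted to work the way the book describes, so I organized these notes for regular reference.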
Neural Networks and Neural Language Models Note (SLP Ch07)
Units
In practice, the sigmoid is not commonly used as an activation function. A better choice is the tanh function, which ranges from -1 to 1: $$y = \frac{e^z - e^{-z}}{e^z + e^{-z}}$$
The most commonly used is the rectified linear unit, also called ReLU: $$y = \max(z, 0)$$
In the sigmoid or tanh functions, very high values of z result in values of y that are saturated, i.e. extremely close to 1, where the derivative is close to 0, which causes problems for learning.
- Rectifiers don’t have this problem: for positive z the output grows linearly with z, so the derivative stays at 1 instead of shrinking toward 0.
- By contrast, the tanh function has the nice properties of being smoothly differentiable and mapping outlier values toward the mean.
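The saturation effect can be checked numerically. The sketch below evaluates the three activations and their derivatives at a few values of z; the function names are illustrative, and the derivative formulas are the standard ones (sigmoid: y(1 - y), tanh: 1 - y^2, ReLU: 1 for z > 0, else 0).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(z, 0.0)


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_grad(z):
    return 1.0 - math.tanh(z) ** 2


def relu_grad(z):
    # The derivative at exactly z = 0 is taken to be 0 here, by convention.
    return 1.0 if z > 0 else 0.0


if __name__ == "__main__":
    for z in (0.5, 2.0, 10.0):
        print(
            f"z={z:5.1f}  "
            f"sigmoid'={sigmoid_grad(z):.2e}  "
            f"tanh'={tanh_grad(z):.2e}  "
            f"relu'={relu_grad(z):.1f}"
        )
```

As z grows, the sigmoid and tanh derivatives shrink toward 0 while the ReLU derivative stays at 1, which is the contrast drawn above.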