Problematic of the sliding window in general NN:
- Like Markov it limits the context from which information can be extracted (limits to window area)
- Window makes it difficult to learn systematic patterns arising from phenomena like constituency
RNN is a class of networks designed to address these problems by processing sequences explicitly as sequences, allowing us to handle variable length inputs without the use of arbitrary fixed-sized windows.