1 Segmentation
(Author: Adam Przepiórkowski)
Tags are assigned to segments (tokens; roughly, words). A segment is never longer than an orthographic
word (a string ‘from space to space’), but it is sometimes shorter:
Applying the segmentation principles given above to the text in 1. (translated into English in 2.)
yields the segmentation presented in 3.
1. Pojechalibyśmy z Janem M. Rokitą i Janem Nowakiem-Jeziorańskim na sesję
polsko-amerykańską, gdyby nas zaprosił George W. Byłaby to nasza już 2. doń podróż od
czasów PRL-u, a może i 3., czy nawet 4.
2. ‘We would go with Jan M. Rokita and Jan Nowak-Jeziorański to the Polish-American
session, if we were invited by George W. That would already be our 2nd trip to him since
the times of PRL, and perhaps 3rd, or even 4th.’
3. [Pojechali][by][śmy] [z] [Janem] [M.] [Rokitą] [i] [Janem] [Nowakiem][-][Jeziorańskim] [na]
[sesję] [polsko][-][amerykańską][,] [gdyby] [nas] [zaprosił] [George] [W][.] [Była][by] [to]
[nasza] [już] [2.] [do][ń] [podróż] [od] [czasów] [PRL-u][,] [a] [może] [i] [3.][,] [czy] [nawet]
[4][.]
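
The bracket notation in 3. can be checked mechanically against the raw text in 1. The following is a minimal Python sketch, not part of any corpus tooling: it reads the bracketed string, groups segments by orthographic word (brackets written without an intervening space belong to one word), and confirms that the segments of each word concatenate back to that word. This is one way of illustrating the statement that a segment is never longer than an orthographic word. The names `parse_bracketed` and `check_segmentation` are hypothetical and chosen here only for illustration; the sketch verifies a given segmentation and does not attempt to produce one.

```python
import re

# Example 1. (raw text) and example 3. (its segmentation), copied from above.
TEXT = (
    "Pojechalibyśmy z Janem M. Rokitą i Janem Nowakiem-Jeziorańskim na sesję "
    "polsko-amerykańską, gdyby nas zaprosił George W. Byłaby to nasza już 2. doń "
    "podróż od czasów PRL-u, a może i 3., czy nawet 4."
)
BRACKETED = (
    "[Pojechali][by][śmy] [z] [Janem] [M.] [Rokitą] [i] [Janem] "
    "[Nowakiem][-][Jeziorańskim] [na] [sesję] [polsko][-][amerykańską][,] "
    "[gdyby] [nas] [zaprosił] [George] [W][.] [Była][by] [to] [nasza] [już] "
    "[2.] [do][ń] [podróż] [od] [czasów] [PRL-u][,] [a] [może] [i] [3.][,] "
    "[czy] [nawet] [4][.]"
)


def parse_bracketed(bracketed):
    """Return one list of segments per orthographic word.

    Whitespace in the bracketed string separates orthographic words;
    adjacent brackets with no space between them share a word.
    """
    return [re.findall(r"\[([^\[\]]+)\]", chunk) for chunk in bracketed.split()]


def check_segmentation(text, bracketed):
    """True if the segments of each word concatenate back to that word of `text`."""
    rebuilt = ["".join(segments) for segments in parse_bracketed(bracketed)]
    return rebuilt == text.split()


words = parse_bracketed(BRACKETED)
segments = [segment for word in words for segment in word]

if __name__ == "__main__":
    print(check_segmentation(TEXT, BRACKETED))
    # List the orthographic words that were split into more than one segment.
    for word in words:
        if len(word) > 1:
            print("".join(word), "->", word)
```

Because every segment stays inside a single whitespace-delimited word, the check reduces to a word-by-word comparison; a segment spanning a space would make the rebuilt word list differ from `TEXT.split()`.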