1 Segmentation

(Author: Adam Przepiórkowski)

Tags are assigned to segments (tokens; roughly, words). A segment is never longer than an orthographic word (a string ‘from space to space’), but it is sometimes shorter than one.

Applied to the text in 1. (translated into English in 2.), the segmentation principles given above yield the segmentation shown in 3.

  1. Pojechalibyśmy z Janem M. Rokitą i Janem Nowakiem-Jeziorańskim na sesję polsko-amerykańską, gdyby nas zaprosił George W. Byłaby to nasza już 2. doń podróż od czasów PRL-u, a może i 3., czy nawet 4.
  2. ‘We would go with Jan M. Rokita and Jan Nowak-Jeziorański to the Polish-American session, if we were invited by George W. That would already be our 2nd trip to him since the times of PRL, and perhaps 3rd, or even 4th.’
  3. [Pojechali][by][śmy] [z] [Janem] [M.] [Rokitą] [i] [Janem] [Nowakiem][-][Jeziorańskim] [na] [sesję] [polsko][-][amerykańską][,] [gdyby] [nas] [zaprosił] [George] [W][.] [Była][by] [to] [nasza] [już] [2.] [do][ń] [podróż] [od] [czasów] [PRL-u][,] [a] [może] [i] [3.][,] [czy] [nawet] [4][.]
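
The bracket notation in 3. can be checked mechanically. The following is a minimal Python sketch, not part of the tagset itself: it reads the bracketed string, groups segments by orthographic word, and confirms that no segment crosses a space, so that gluing the segments of each word back together reproduces the text in 1. The names `BRACKETED`, `parse_bracketed` and `orthographic_words` are hypothetical helpers introduced only for this illustration. The sketch does not implement the segmentation principles; it only verifies a segmentation that is already given.

```python
import re

# Segmentation copied verbatim from example 3.
BRACKETED = (
    "[Pojechali][by][śmy] [z] [Janem] [M.] [Rokitą] [i] [Janem] "
    "[Nowakiem][-][Jeziorańskim] [na] [sesję] [polsko][-][amerykańską][,] "
    "[gdyby] [nas] [zaprosił] [George] [W][.] [Była][by] [to] [nasza] [już] "
    "[2.] [do][ń] [podróż] [od] [czasów] [PRL-u][,] [a] [może] [i] [3.][,] "
    "[czy] [nawet] [4][.]"
)


def parse_bracketed(text):
    """Return one list of segments per orthographic (space-delimited) word."""
    # Splitting on whitespace first guarantees that a segment can never
    # span more than one orthographic word.
    return [re.findall(r"\[([^\[\]]+)\]", word) for word in text.split()]


def orthographic_words(groups):
    """Glue the segments of each group back into its orthographic word."""
    return ["".join(group) for group in groups]


groups = parse_bracketed(BRACKETED)
words = orthographic_words(groups)

# Orthographic words that consist of more than one segment.
split_words = [(word, group) for word, group in zip(words, groups) if len(group) > 1]

if __name__ == "__main__":
    print(" ".join(words))
    for word, group in split_words:
        print(word, "->", group)
```

Running the script prints the reconstructed text followed by the orthographic words that were split into several segments, for example `doń` as `do` + `ń` and the sentence-final `4.` as `4` + `.`, whereas the sentence-internal `2.` stays a single segment.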