tm - How to convert a corpus to a data.frame with metadata in R


How can I convert a corpus into a data frame in R that also contains the metadata? I tried the suggestion from "convert corpus into data.frame in R", but the resulting data frame only contains the text lines from all docs in the corpus. I also need the document ID and maybe the line number of each text line, in two columns. So, how can I extend this command: dataframe <- data.frame(text=unlist(sapply(mycorpus, `[`, "content")), stringsAsFactors=FALSE) to get this data?

I tried

    dataframe <- data.frame(id=sapply(corpus, meta(corpus, "id")), text=unlist(sapply(corpus, `[`, "content")), stringsAsFactors=F)

but that didn't help; I got the error message "Error in match.fun(FUN) : 'meta(corpus, "id")' ist nicht Funktion, Zeichen oder Symbol" (in English: "is not a function, character or symbol").

The corpus was extracted from plain text files; here is an example:

    > str(corpus)
    [...]
    $ 1178531510 :List of 2
      ..$ content: chr [1:67] " uberrasch sagt [...] gemacht echt schad verursacht" ...
      ..$ meta   :List of 7
      .. ..$ author       : chr(0)
      .. ..$ datetimestamp: POSIXlt[1:1], format: "2015-08-16 14:44:11"
      .. ..$ description  : chr(0)
      .. ..$ heading      : chr(0)
      .. ..$ id           : chr "1178531510" # <--- the ID I want in the data.frame
      .. ..$ language     : chr "de"
      .. ..$ origin       : chr(0)
      .. ..- attr(*, "class")= chr "TextDocumentMeta"
      ..- attr(*, "class")= chr [1:2] "PlainTextDocument" "TextDocument"
    [...]

Many thanks in advance :)

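There are two problems: you should not repeat the argument corpus inside sapply (pass the function meta itself, followed by its extra argument "id"), and multi-paragraph texts are turned into character vectors of length > 1, which you should paste together before unlisting.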

    dataframe <-
      data.frame(id=sapply(corpus, meta, "id"),
                 text=unlist(lapply(sapply(corpus, '[', "content"), paste, collapse="\n")),
                 stringsAsFactors=FALSE)
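
The command above yields one row per document. The question also asked for the line number of each text line, so here is a minimal sketch of that variant, with one row per line. To keep it runnable without tm installed, it uses a hypothetical stand-in corpus: a plain named list whose documents have the same $content and $meta$id layout as the str() output in the question. The sample texts and IDs are invented, and applying the same loop to a real tm corpus is an assumption based on that layout, not something tested here.

```r
# Hypothetical stand-in for the tm corpus (invented sample data): each
# document is a list with $content (one string per line) and $meta$id.
corpus <- list(
  "doc1" = list(content = c("first line", "second line"),
                meta    = list(id = "doc1")),
  "doc2" = list(content = "only line",
                meta    = list(id = "doc2"))
)

# Build one small data frame per document: the ID is repeated for every
# line, and seq_along() numbers the lines within that document.
rows <- lapply(corpus, function(doc) {
  data.frame(id   = rep(doc$meta$id, length(doc$content)),
             line = seq_along(doc$content),
             text = doc$content,
             stringsAsFactors = FALSE)
})

# Stack the per-document frames and drop the row names rbind() generates.
dataframe <- do.call(rbind, rows)
rownames(dataframe) <- NULL

print(dataframe)
```

Repeating the ID with rep() keeps the three columns the same length, so a document with no lines contributes zero rows instead of raising an error.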
