Text file to list in R

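I have a large text file with a variable number of fields per line. The first entry on each line is a biological pathway, and each subsequent entry is a gene in that pathway. The first few lines might look like this: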

path1   gene1 gene2
path2   gene3 gene4 gene5 gene6
path3   gene7 gene8 gene9

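I need to read this file into R as a list in which each element is a character vector and the name of each element is the first entry on its line, for example: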

> pathways <- list(
+     path1=c("gene1","gene2"), 
+     path2=c("gene3","gene4","gene5","gene6"),
+     path3=c("gene7","gene8","gene9")
+ )
> 
> str(pathways)
List of 3
 $ path1: chr [1:2] "gene1" "gene2"
 $ path2: chr [1:4] "gene3" "gene4" "gene5" "gene6"
 $ path3: chr [1:3] "gene7" "gene8" "gene9"
> 
> str(pathways$path1)
 chr [1:2] "gene1" "gene2"
> 
> print(pathways)
$path1
[1] "gene1" "gene2"

$path2
[1] "gene3" "gene4" "gene5" "gene6"

$path3
[1] "gene7" "gene8" "gene9"

…but I need to do this automatically for thousands of lines. I saw a similar question posted here previously, but I couldn't figure out how to do this from that thread.

Thanks in advance.

Here's one way to do it:

# Read in the data
x <- scan("data.txt", what="", sep="\n")
# Split each line into fields on runs of one or more whitespace characters
y <- strsplit(x, "[[:space:]]+")
# Extract the first vector element and set it as the list element name
names(y) <- sapply(y, `[[`, 1)
#names(y) <- sapply(y, function(x) x[[1]]) # same as above
# Remove the first vector element from each list element
y <- lapply(y, `[`, -1)
#y <- lapply(y, function(x) x[-1]) # same as above
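
If you need this in more than one place, the same steps can be wrapped in a small function. The following is only a sketch: `read_pathways` is a hypothetical name (not part of the answer above), it uses `readLines()` in place of `scan()`, and it assumes that blank lines should be skipped and that leading or trailing whitespace is not significant. A `textConnection` stands in for the real file so the example runs on its own.

```r
# Hypothetical helper: read "pathway gene gene ..." lines into a named list.
# `con` can be a file path or any connection that readLines() accepts.
read_pathways <- function(con) {
  lines <- trimws(readLines(con))
  # Assumption: blank lines carry no information, so drop them
  lines <- lines[nzchar(lines)]
  # Split each line into fields on runs of whitespace
  fields <- strsplit(lines, "[[:space:]]+")
  # The first field is the pathway name, the remaining fields are the genes
  genes <- lapply(fields, `[`, -1)
  setNames(genes, vapply(fields, `[[`, character(1), 1))
}

# In-memory stand-in for the file, using the sample lines from the question
con <- textConnection(c(
  "path1   gene1 gene2",
  "path2   gene3 gene4 gene5 gene6",
  "path3   gene7 gene8 gene9"
))
pathways <- read_pathways(con)
close(con)

str(pathways)
```

For a real file you would call something like `read_pathways("data.txt")`. A line that contains only a pathway name ends up as an empty character vector under that name.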
http://stackoverflow.com/questions/6602881/text-file-to-list-in-r
