python – Count word occurrences in a list of strings

How can I count the number of times a word occurs in a list of strings?

For example:

['This is a sentence', 'This is another sentence']

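and the result for the word "sentence" is 2.

Use a collections.Counter() object and split each sentence into words on whitespace. You probably also want to lowercase the words and strip punctuation: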

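from collections import Counter

counts = Counter()

# sequence_of_sentences is the list of strings to scan
for sentence in sequence_of_sentences:
    # strip leading/trailing punctuation from each word, then lowercase it
    counts.update(word.strip('.,?!"\'').lower() for word in sentence.split())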

Or use a regular expression that matches only word characters:

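from collections import Counter
import re

counts = Counter()
# \w+ matches runs of letters, digits and underscores
words = re.compile(r'\w+')

for sentence in sequence_of_sentences:
    # lowercase first so that 'This' and 'this' are counted together
    counts.update(words.findall(sentence.lower()))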
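
The two approaches do not tokenize identically, which matters for words containing an apostrophe. The sketch below runs both on a made-up input (the `sentences` list and the names `stripped` and `matched` are illustrative assumptions, not part of the original answer) to show the difference:

```python
from collections import Counter
import re

# Hypothetical sample input, chosen to expose the difference.
sentences = ["Don't panic!", "don't stop"]

# Approach 1: split on whitespace, strip punctuation from the ends only.
stripped = Counter()
for sentence in sentences:
    stripped.update(word.strip('.,?!"\'').lower() for word in sentence.split())

# Approach 2: regular expression matching runs of word characters.
words = re.compile(r'\w+')
matched = Counter()
for sentence in sentences:
    matched.update(words.findall(sentence.lower()))

print(stripped["don't"])  # 2: the inner apostrophe is kept
print(matched["don"])     # 2: \w+ splits "don't" into "don" and "t"
print(matched["t"])       # 2
```

Which behavior is preferable depends on the text; for contractions, stripping is probably the closer match to what a reader would call a word.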

You now have a counts dictionary holding the number of occurrences of each word.

Demo:
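
As a small illustration of how such a Counter can be queried, the sketch below builds one directly from an already tokenized word list (an assumption made to keep the example short). It relies only on standard Counter behavior: a missing key yields 0 rather than raising KeyError, and most_common() returns the highest counts first.

```python
from collections import Counter

# Words from the two example sentences, already lowercased and split.
counts = Counter(['this', 'is', 'a', 'sentence',
                  'this', 'is', 'another', 'sentence'])

print(counts['sentence'])     # 2
print(counts['missing'])      # 0, a Counter does not raise KeyError
print(counts.most_common(3))  # the three most frequent words with their counts
```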

>>> sequence_of_sentences = ['This is a sentence', 'This is another sentence']
>>> from collections import Counter
>>> counts = Counter()
>>> for sentence in sequence_of_sentences:
...     counts.update(word.strip('.,?!"\'').lower() for word in sentence.split())
... 
>>> counts
Counter({'this': 2, 'is': 2, 'sentence': 2, 'a': 1, 'another': 1})
>>> counts['sentence']
2
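
If only a single word's count is needed, the same steps could be wrapped in a small helper. This is a minimal sketch; the function name `count_word` is hypothetical and does not come from the original answer.

```python
from collections import Counter

def count_word(sentences, word):
    """Return how often `word` occurs across `sentences`, ignoring case."""
    counts = Counter()
    for sentence in sentences:
        # Same tokenization as above: split on whitespace, strip
        # surrounding punctuation, lowercase.
        counts.update(w.strip('.,?!"\'').lower() for w in sentence.split())
    return counts[word.lower()]

print(count_word(['This is a sentence', 'This is another sentence'], 'sentence'))  # 2
```
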
https://stackoverflow.com/questions/18231542/count-word-occurrence-in-a-list-of-strings
