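Problem description
Find Strings Located Between Specific Strings in List Python
I am writing code that extracts data from a website and prints all the text between specific tags. Each time the code extracts data from a tag, I store the result in a list, so I end up with a list like: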
Warning
Not
News
Legends
Name1
Name2
Name3
Pickle
Stop
Hello
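I would like to look through this list of strings and have code that can find the keywords legends and pickle and print any strings between them.
To elaborate, as a further activity I may create a complete list of all possible legend names and then, whenever any of them appear in a list I generate, print out the ones that occur again. Any insight into these questions?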
Reference solutions
Method 1:
You can use the list.index() method to find the numerical index of an item within a list, and then use list slicing to return the items in your list between those two points:
your_list = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
your_list[your_list.index('Legends')+1:your_list.index('Pickle')]
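The caveat is that .index() returns only the index of the first occurrence of the given item, so if your list has two 'Legends' items, you will only get the index of the first one. The comparison is also case-sensitive, so searching for 'legends' will not find 'Legends'.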
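Where the markers can appear more than once, or with different capitalization (the question writes legends and pickle in lowercase), a single pass over the list can collect every segment. The following is a minimal sketch, not part of the original answers: segments_between is a hypothetical helper name, and the extra 'Name4' segment is made-up sample data.

```python
def segments_between(items, start, end):
    """Yield one list per start...end pair, comparing markers case-insensitively."""
    start, end = start.lower(), end.lower()
    current = None  # None means we are not currently inside a segment
    for item in items:
        key = item.lower()
        if current is None:
            if key == start:
                current = []  # start marker found: open a new segment
        elif key == end:
            yield current  # end marker found: emit the segment and close it
            current = None
        else:
            current.append(item)


# Hypothetical sample data with two Legends ... Pickle segments
your_list = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3',
             'Pickle', 'Stop', 'Hello', 'Legends', 'Name4', 'Pickle']
print(list(segments_between(your_list, 'legends', 'pickle')))
# [['Name1', 'Name2', 'Name3'], ['Name4']]
```

A segment that is opened but never closed by an end marker is silently dropped in this sketch; whether that is the right behavior depends on the scraped data.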
Method 2:
Try this:
words = [
"Warning", "Not", "News", "Legends", "Name1",
"Name2", "Name3", "Pickle", "Stop", "Hello"
]
words_in_between = words[words.index("Legends") + 1:words.index("Pickle")]
print(words_in_between)
output:
['Name1', 'Name2', 'Name3']
This assumes that both "Legends" and "Pickle" are in the list exactly once.
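If that assumption might not hold, note that list.index() raises ValueError when the item is absent, and that it accepts an optional start position. A small sketch of a more defensive variant follows; between is a hypothetical helper name, and returning an empty list on failure is an assumed design choice rather than anything the answer prescribes.

```python
def between(words, start, end):
    """Return the items strictly between start and the next end, or [] if either is missing."""
    try:
        i = words.index(start)
        j = words.index(end, i + 1)  # only look for end after the start marker
    except ValueError:
        return []  # assumed behavior: a missing marker means "nothing in between"
    return words[i + 1:j]


words = [
    "Warning", "Not", "News", "Legends", "Name1",
    "Name2", "Name3", "Pickle", "Stop", "Hello"
]
print(between(words, "Legends", "Pickle"))   # ['Name1', 'Name2', 'Name3']
print(between(words, "Legends", "Missing"))  # []
```

Searching for the end marker from i + 1 onward also avoids an empty or misleading slice when "Pickle" happens to appear before "Legends".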
Method 3:
For the second part of the question (checking against a list of known names), you could create a regex alternation of the expected names, then use a list comprehension to collect the matches:
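import re

tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']
names = ['Name1', 'Name2', 'Name3']
# Anchored alternation: a tag matches only if it equals one of the names exactly.
# re.escape guards against names that contain regex metacharacters.
regex = r'^(?:' + r'|'.join(map(re.escape, names)) + r')$'
matches = [x for x in tags if re.search(regex, x)]
print(matches)  # ['Name1', 'Name2', 'Name3']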
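Because the pattern above is anchored at both ends, it only accepts exact matches, so a plain set lookup may do the same job without a regex. This is an alternative sketch rather than the answerer's method; known_names and its contents are hypothetical sample values.

```python
tags = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']

# Hypothetical master list of legend names; 'Name7' never appears in tags
known_names = {'Name1', 'Name3', 'Name7'}

# Keep the order of tags, testing membership against the set
matches = [x for x in tags if x in known_names]
print(matches)  # ['Name1', 'Name3']
```

The regex version remains the better fit if partial or pattern-based matching is needed later.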
Method 4:
You can use list.index() to get the index of the first occurrence of 'Legends' and 'Pickle'. Then you can use list slicing to get the elements in between:
l = ['Warning','Not','News','Legends','Name1','Name2','Name3','Pickle','Stop','Hello']
l[l.index('Legends')+1 : l.index('Pickle')]
['Name1', 'Name2', 'Name3']
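If the scraped items arrive as an iterator or generator rather than a list, index() and slicing are not available. One possible alternative, sketched here with the standard itertools module, is to skip items up to the start marker and then take items until the end marker; between_iter is a hypothetical helper name.

```python
from itertools import dropwhile, takewhile


def between_iter(items, start, end):
    """Return the items after the first start marker and before the next end marker."""
    it = dropwhile(lambda x: x != start, items)  # skip everything before start
    next(it, None)  # discard the start marker itself (no-op if it was never found)
    return list(takewhile(lambda x: x != end, it))  # stop at the end marker


l = ['Warning', 'Not', 'News', 'Legends', 'Name1', 'Name2', 'Name3', 'Pickle', 'Stop', 'Hello']
print(between_iter(iter(l), 'Legends', 'Pickle'))  # ['Name1', 'Name2', 'Name3']
```

In this sketch a missing end marker returns everything after the start marker, which differs from index(), where a missing marker raises ValueError.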
Method 5:
NumPy's where function gives you all occurrences of a given item. So first make the list a NumPy array:
import numpy as np
my_array = np.array(["Warning", "Not", "News", "Legends", "Name1", "Name2", "Name3", "Pickle", "Stop", "Hello", "Legends", "Name1", "Name2", "Name3", "Pickle"])
From here on you can use NumPy functions:
legends = np.where(my_array == "Legends")
pickle = np.where(my_array == "Pickle")
Concatenate the two results for easier looping (row 0 holds the "Legends" indices, row 1 the "Pickle" indices; this requires the same number of each):
stack = np.concatenate([legends, pickle], axis=0)
Look for the values between each "Legends" and its matching "Pickle":
np.concatenate([my_array[stack[0, i] + 1:stack[1, i]] for i in range(stack.shape[1])])
With the array above, the result is:
array(['Name1', 'Name2', 'Name3', 'Name1', 'Name2', 'Name3'], dtype='<U7')
(by triplecute, PeptideWitch, sarartur, Tim Biegeleisen, Epsi95, thomas)