Problem description
(Python) Which is more efficient for splitting a string on multiple separators: 1) chaining replace calls and then using split, or 2) using regular expressions?
Example:
(case 1)
# First use the replace method to convert the different delimiters to a single one, then use the split method
text = "python is, an easy;language; to, learn."
text_one_delimiter = text.replace("# ", ", ").replace("% ", ", ").replace("; ", ", ").replace("‑ ", ", ")
print(text_one_delimiter.split(", "))
(case 2)
# Split on multiple delimiters using a regular expression
import re
text = "python is# an% easy;language‑ to, learn."
print(re.split('; |, |# |% |‑ ', text))
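The two cases above use different input strings, so their outputs cannot be compared directly. As a minimal sketch (the helper names `split_with_replace` and `split_with_regex`, the `DELIMITERS` list, and the sample string are illustrative assumptions, not part of the question), both approaches can be run on the same input to check that they produce the same list:

```python
import re

# Non-breaking hyphen (U+2011), the hyphen character used in the question
HYPHEN = "\u2011"

# Assumed sample input containing each of the five delimiters once
TEXT = f"python is# an% easy; language{HYPHEN} to, learn."

# The five two-character delimiters from the question
DELIMITERS = ["; ", ", ", "# ", "% ", f"{HYPHEN} "]


def split_with_replace(text, delimiters, target=", "):
    """Normalize every delimiter to `target`, then split once."""
    for delimiter in delimiters:
        text = text.replace(delimiter, target)
    return text.split(target)


def split_with_regex(text, delimiters):
    """Build an alternation pattern from the delimiters and split on it."""
    pattern = "|".join(re.escape(delimiter) for delimiter in delimiters)
    return re.split(pattern, text)


if __name__ == "__main__":
    print(split_with_replace(TEXT, DELIMITERS))
    print(split_with_regex(TEXT, DELIMITERS))
```

`re.escape` is used so that a delimiter containing a regex metacharacter would still be matched literally.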
Reference solution
Method 1:
The timeit module is useful for comparing the speed of code snippets. It can be used in the following way:
import timeit
case1 = '''text = "python is, an easy;language; to, learn."
text_one_delimiter = text.replace("# ", ", ").replace("% ", ", ").replace("; ", ", ").replace("‑ ", ", ")
text_one_delimiter.split(", ")'''
case2_setup = "import re"
case2 = '''text = "python is# an% easy;language‑ to, learn."
re.split('; |, |# |% |‑ ', text)'''
print(timeit.timeit(case1))
print(timeit.timeit(case2,case2_setup))
Output (will depend on your machine):
1.1250261999999793
2.2901268999999616
Note that I excluded the print calls from the examined code and put import re in the setup, as otherwise the import statement would be executed needlessly on every iteration. The conclusion is that, in this particular case, the method with multiple .replace calls is faster than re.split.
(tested in Python 3.7.3)
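One caveat about the measurement above: the two snippets are timed on different strings, and the case 1 string contains no "# ", "% " or "‑ ", so most of its replace calls find nothing to replace. A hedged variant of the same comparison, assuming one shared input string and a pattern compiled once outside the timed code (the names `TEXT`, `PATTERN`, `with_replace`, `with_regex` and `benchmark` are illustrative), might look like this:

```python
import re
import timeit

# Non-breaking hyphen (U+2011), the hyphen character used in the question
HYPHEN = "\u2011"

# Assumption: one shared input so both approaches do the same work
TEXT = f"python is# an% easy; language{HYPHEN} to, learn."

# Compiled once, outside the timed code
PATTERN = re.compile(f"; |, |# |% |{HYPHEN} ")


def with_replace(text=TEXT):
    """Chain of replace calls followed by a single split."""
    unified = (
        text.replace("# ", ", ")
        .replace("% ", ", ")
        .replace("; ", ", ")
        .replace(f"{HYPHEN} ", ", ")
    )
    return unified.split(", ")


def with_regex(text=TEXT):
    """Single split with the precompiled pattern."""
    return PATTERN.split(text)


def benchmark(number=100_000):
    """Return total seconds for `number` calls of each approach."""
    return {
        "replace": timeit.timeit(with_replace, number=number),
        "regex": timeit.timeit(with_regex, number=number),
    }


if __name__ == "__main__":
    for name, seconds in benchmark().items():
        print(f"{name}: {seconds:.4f} s")
```

The timings will depend on the machine, the Python version, and the length of the input, so the ranking is best re-measured on data resembling the real workload.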
(by Shambhav Agrawal, Daweo)