Problem description
Unexpected output when filtering Python list: What am I doing wrong?
I am trying to filter a list with the following code:
test=['aaa','bbb','ccc','ddd','footer','header']
def rm_hf(x): return x != 'footer'
filter(rm_hf,test)
Result:
>>>['aaa','bbb','ccc','ddd','header']
This is the expected result: 'footer' is found in the list and removed.
Now I want to remove both 'header' and 'footer', so I do this:
test2=['aaa','bbb','ccc','ddd','footer','header']
def rm_hf2(x): return x != 'footer' or x != 'header'
filter(rm_hf2,test2)
Result:
>>>['aaa','bbb','ccc','ddd','footer','header']
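Now this is strange: it returns 'footer' and 'header' along with everything else instead of filtering them out.
What am I doing wrong? I think my logic is correct...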
Reference solutions
Method 1:
Your logic is correct because you think like a human. Your computer does not. It reads every element from your list, then stumbles upon 'footer'. "Is 'footer' different from 'footer'?", it asks. "No! It's the same string, so this evaluates to False. Let's see the next condition." "Is 'footer' different from 'header'? Yes!" The condition is therefore False or True, which evaluates to True. The same happens in reverse for 'header', so neither string is ever filtered out.
You want an and, not an or:
def rm_hf2(x): return x != 'footer' and x != 'header'
You could also use a tuple and the in keyword, which is more readable:
def rm_hf2(x): return x not in ('footer', 'header')
It's important that you understand what's really going on with "and" and "or", though. And let's be honest: if something isn't working as you think it should, the problem most likely lies in your own code, not in the Python language itself.
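To make the evaluation described above concrete, here is a small sketch that prints both conditions for a few sample values. The helper names keep_with_or and keep_with_and are made up for this illustration and do not appear in the question.

```python
# Hypothetical helper names, used only for this illustration.
def keep_with_or(x):
    # Always True: no string can equal both 'footer' and 'header' at once,
    # so at least one of the two inequalities always holds.
    return x != 'footer' or x != 'header'

def keep_with_and(x):
    # True only when x is neither 'footer' nor 'header'.
    return x != 'footer' and x != 'header'

for value in ('aaa', 'footer', 'header'):
    print(value, keep_with_or(value), keep_with_and(value))

# By De Morgan's law, the `and` version is the same test as `not in`.
samples = ['aaa', 'bbb', 'footer', 'header']
same_as_not_in = all(
    keep_with_and(s) == (s not in ('footer', 'header')) for s in samples
)
print(same_as_not_in)
```

The or version returns True for every input, which is why filter kept the whole list.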
Method 2:
my logic is correct
Actually, no it isn't, as highlighted in other answers.
A far neater way to achieve the desired outcome is to use list comprehensions, viz:
test = ['aaa', 'bbb', 'ccc', 'ddd', 'footer', 'header']
undesirable = ['footer', 'header']
[_ for _ in test if _ not in undesirable]
From the documentation:
Note that filter(function, iterable) is equivalent to [item for item in iterable if function(item)] if function is not None and [item for item in iterable if item] if function is None.
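As a quick check of that equivalence, the sketch below compares both forms. It assumes Python 3, where filter returns a lazy iterator and therefore needs to be wrapped in list() before comparing; the output shown in the question suggests Python 2, where filter returned a list directly.

```python
test = ['aaa', 'bbb', 'ccc', 'ddd', 'footer', 'header']

def rm_hf2(x):
    return x not in ('footer', 'header')

# filter with a function versus the equivalent comprehension.
via_filter = list(filter(rm_hf2, test))
via_comprehension = [item for item in test if rm_hf2(item)]
print(via_filter == via_comprehension)

# With None as the function, filter keeps only the truthy items.
mixed = ['aaa', '', 'bbb', None, 0, 'ccc']
none_filter = list(filter(None, mixed))
none_comprehension = [item for item in mixed if item]
print(none_filter == none_comprehension)
```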
That said, there's no time like the present to brush‑up on your Boolean logic!
Were you to unit test your code, you would quickly find out that your second filtration function is not doing what you expect. Here is a simplistic example:
$ cat 4281875.py
#!/usr/bin/env python

import unittest

def rm_hf2(x): return x != 'footer' or x != 'header'

class test_rm_hft(unittest.TestCase):

    def test_aaa_is_not_filtered(self):
        self.assertTrue(rm_hf2('aaa'))

    def test_footer_is_filtered_out(self):
        self.assertFalse(rm_hf2('footer'))

if __name__ == '__main__':
    unittest.main()
$ ./4281875.py
.F
======================================================================
FAIL: test_footer_is_filtered_out (__main__.test_rm_hft)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "./4281875.py", line 13, in test_footer_is_filtered_out
    self.assertFalse(rm_hf2('footer'))
AssertionError
----------------------------------------------------------------------
Ran 2 tests in 0.000s
FAILED (failures=1)
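For comparison, here is a minimal sketch of the same kind of test run against the and version of the function. The class name RmHf2Test and the extra header test are additions for illustration, and the suite is run programmatically rather than through unittest.main() so the result can be inspected.

```python
import unittest

def rm_hf2(x):
    # Corrected condition: keep x only if it is neither string.
    return x != 'footer' and x != 'header'

class RmHf2Test(unittest.TestCase):  # hypothetical name for this sketch
    def test_aaa_is_not_filtered(self):
        self.assertTrue(rm_hf2('aaa'))

    def test_footer_is_filtered_out(self):
        self.assertFalse(rm_hf2('footer'))

    def test_header_is_filtered_out(self):
        self.assertFalse(rm_hf2('header'))

# Load and run the tests without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RmHf2Test)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```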
Method 3:
What everybody else said, plus:
When you have several items that you want to exclude, use a set instead of a chain of ands or a tuple:
# do once
blacklist = set(['header', 'footer'])
# as needed
filter(lambda x: x not in blacklist, some_iterable)
Rationale: Looking through a tuple takes time proportional to the position of the found item; failure takes the same time as finding the last item. Looking up an item in a set takes the same time for all items, and for failure. Sets usually win for a large number of items. It all depends on the probability that each item will be searched for, and what the probability of failure is. Tuples can win even with a large collection when there's a high probability of a few items (they should be put at the front of the tuple) and a low chance of failure.
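One rough way to see this on your own machine is a timeit comparison like the sketch below. The collection size, the item names, and the repeat count are arbitrary choices for illustration, and the measured times will vary by interpreter and hardware.

```python
import timeit

n = 1000  # arbitrary size, chosen only for illustration
items = ['item%d' % i for i in range(n)]
as_tuple = tuple(items)
as_set = set(items)

# A failed lookup is the worst case for the tuple: every element is compared.
missing = 'not-there'

tuple_time = timeit.timeit(lambda: missing in as_tuple, number=2000)
set_time = timeit.timeit(lambda: missing in as_set, number=2000)
print('tuple: %.6fs  set: %.6fs' % (tuple_time, set_time))
```

Swapping missing for as_tuple[0] shows the case where the tuple can compete: a hit on the first element stops the scan immediately.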
Method 4:
You can also use a list comprehension instead of filter:
test = ['aaa','bbb','ccc','ddd','footer','header']
filtered_test = [x for x in test if x not in ('footer', 'header')]
Or a generator expression (depending on your needs):
test = ['aaa','bbb','ccc','ddd','footer','header']
filtered_test = (x for x in test if x not in ('footer', 'header'))
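The practical difference between the two forms is sketched below: the list is built immediately and can be reused, while the generator produces items lazily and can only be consumed once. The variable names are chosen for this illustration.

```python
test = ['aaa', 'bbb', 'ccc', 'ddd', 'footer', 'header']

as_list = [x for x in test if x not in ('footer', 'header')]
as_gen = (x for x in test if x not in ('footer', 'header'))

first_pass = list(as_gen)   # consumes the generator
second_pass = list(as_gen)  # generator is exhausted, so this is empty
print(as_list, first_pass, second_pass)
```

A generator is a reasonable choice when the result is iterated exactly once or the input is large; otherwise the list is simpler to work with.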
(by Phyo Arkar Lwin, Vincent Savard, johnsyweb, John Machin, Corey Goldberg)