How do you match the value of one dataframe's column with another dataframe's column using conditionals?


Problem description


I have two dataframes:

Row No.    Subject
1          Apple
2          Banana
3          Orange
4          Lemon
5          Strawberry


row_number    Subjects      Special?
1             Banana        Yes
2             Lemon         No
3             Apple         No
4             Orange        No
5             Strawberry    Yes
6             Cranberry     Yes
7             Watermelon    No

I want to change the row numbers of the first dataframe to match the second dataframe. It should look like this:

Row No.    Subject
3          Apple
1          Banana
4          Orange
2          Lemon
5          Strawberry

I have tried this code:

for index, row in df1.iterrows():
    if df1['Subject'] == df2['Subjects']:
        df1['Row No.'] = df2['row_number']

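But I get this error:

ValueError: Can only compare identically-labeled Series objects

Does this mean the dataframes must have the same number of rows and columns? Do they also have to be labeled identically? Is there a way around this limitation?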

Edit: I found a promising alternative formulation:

for x in df1['Subject']:
    if x in df2['Subjects'].values:
        df2.loc[df2['Subjects'] == x]['row_number'] = df1.loc[df1['Subject'] == x]['Row No.']

But it does not seem to modify the first dataframe the way I want. Any hints as to why? In addition, I get this warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

Reference solutions

Method 1:

I would avoid for loops here, especially since pandas already has methods that handle this kind of problem.
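As a minimal sketch of why the loop in the question fails (the sample frames are rebuilt from the question's tables): `==` between two Series compares them element by element and needs both sides to carry identical index labels, so a 5-row column cannot be compared with a 7-row column. It does not search one column for the values of the other. `Series.isin` does that membership test without a loop.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame({'Row No.': [1, 2, 3, 4, 5],
                    'Subject': ['Apple', 'Banana', 'Orange', 'Lemon', 'Strawberry']})
df2 = pd.DataFrame({'row_number': [1, 2, 3, 4, 5, 6, 7],
                    'Subjects': ['Banana', 'Lemon', 'Apple', 'Orange',
                                 'Strawberry', 'Cranberry', 'Watermelon']})

# == requires identical index labels on both sides; 5 rows vs 7 rows fails.
try:
    df1['Subject'] == df2['Subjects']
    raised = False
except ValueError as err:
    raised = True
    print(err)

# isin() asks, for each row of df1, whether its value occurs anywhere in df2.
mask = df1['Subject'].isin(df2['Subjects'])
print(mask.tolist())
```

So the frames do not need the same shape; the comparison just has to be expressed as a lookup rather than as an element-wise `==`.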

Using pd.Series.replace

Here is a vectorized way of doing this:

  1. Build d, a dictionary that maps each fruit to its number in the second dataframe.
  2. Call df1.Subject.replace(d) to replace every value that is a key of d with the corresponding dictionary value.
  3. Overwrite the Row No. column with the result.

d = dict(zip(df2['Subjects'], df2['row_number']))
df1['Row No.'] = df1.Subject.replace(d)
print(df1)
print(df1)
      Subject  Row No.
0       Apple        3
1      Banana        1
2      Orange        4
3       Lemon        2
4  Strawberry        5
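A closely related option, which is not part of the answer above, is `Series.map`. The sketch below rebuilds the sample data from the question; the name `lookup` and the fallback of keeping the old number for unmatched subjects are assumptions made for illustration. The practical difference is that `map` yields NaN for a subject missing from the lookup, whereas `replace` leaves the original string in place.

```python
import pandas as pd

# Sample data rebuilt from the tables in the question.
df1 = pd.DataFrame({
    'Row No.': [1, 2, 3, 4, 5],
    'Subject': ['Apple', 'Banana', 'Orange', 'Lemon', 'Strawberry'],
})
df2 = pd.DataFrame({
    'row_number': [1, 2, 3, 4, 5, 6, 7],
    'Subjects': ['Banana', 'Lemon', 'Apple', 'Orange',
                 'Strawberry', 'Cranberry', 'Watermelon'],
    'Special?': ['Yes', 'No', 'No', 'No', 'Yes', 'Yes', 'No'],
})

# Lookup Series: index is the fruit name, value is its row number in df2.
# (Assumes every fruit appears only once in df2.)
lookup = df2.set_index('Subjects')['row_number']

# map() looks each subject up in the index; unmatched subjects become NaN.
mapped = df1['Subject'].map(lookup)

# Assumed fallback: keep the old number where no match exists,
# then restore the integer dtype.
df1['Row No.'] = mapped.fillna(df1['Row No.']).astype(int)
print(df1)
```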

Using pd.merge

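Let's try simply merging the two dataframes and replacing the column completely.

# A left merge keeps every row of df1, in df1's order.
ddf = pd.merge(df1['Subject'],
               df2[['row_number', 'Subjects']],
               left_on='Subject',
               right_on='Subjects',
               how='left').drop(columns='Subjects')

# Rename the merged columns to df1's column names, in reversed order.
ddf.columns = df1.columns[::-1]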
print(ddf)
      Subject  Row No.
0       Apple       3
1      Banana       1
2      Orange       4
3       Lemon       2
4  Strawberry       5
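With `how='left'`, a subject that has no partner in the second dataframe is kept and gets NaN as its number. A minimal sketch of how to spot such rows, using the `indicator` option of `merge`; the small frames and the unmatched fruit `Kiwi` are hypothetical and not taken from the question:

```python
import pandas as pd

# Hypothetical data: 'Kiwi' has no counterpart in df2.
df1 = pd.DataFrame({'Row No.': [1, 2, 3],
                    'Subject': ['Apple', 'Banana', 'Kiwi']})
df2 = pd.DataFrame({'row_number': [1, 2, 3],
                    'Subjects': ['Banana', 'Lemon', 'Apple']})

# indicator=True adds a '_merge' column: 'both' or 'left_only' for a left merge.
merged = df1[['Subject']].merge(
    df2[['Subjects', 'row_number']],
    left_on='Subject',
    right_on='Subjects',
    how='left',
    indicator=True,
)

# Subjects that found no match in df2.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Subject'].tolist()
print(unmatched)
```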

Method 2:

Assuming the first is df1 and the second is df2, this should do what you want it to:

import pandas as pd

d1 = {'Row No.': [1, 2, 3, 4, 5],
      'Subject': ['Apple', 'Banana', 'Orange', 'Lemon', 'Strawberry']}
df1 = pd.DataFrame(data=d1)

d2 = {'row_number': [1, 2, 3, 4, 5, 6, 7],
      'Subjects': ['Banana', 'Lemon', 'Apple', 'Orange',
                   'Strawberry', 'Cranberry', 'Watermelon'],
      'Special?': ['Yes', 'No', 'No', 'No', 'Yes', 'Yes', 'No']}
df2 = pd.DataFrame(data=d2)

for x in df1['Subject']:
    if x in df2['Subjects'].values:
        # Row mask and column label go into one .loc call so the write hits df1 itself;
        # item() turns the one-element Series into a scalar.
        df1.loc[df1['Subject'] == x, 'Row No.'] = df2.loc[df2['Subjects'] == x, 'row_number'].item()

print(df1)

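In your edited question it looks like you had the dataframes swapped, and you were missing the item() call that extracts the actual row_number value rather than a Series object. The chained indexing (df2.loc[...]['row_number'] = ...) also assigns to a temporary copy, which is what triggers the warning.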
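To make those two points concrete, here is a small sketch on a hypothetical two-row frame (the frame and the value 10 are made up for illustration). A boolean mask always returns a Series, even when a single row matches, and passing the row mask and the column label to one `.loc` call writes to the dataframe itself.

```python
import pandas as pd

# Hypothetical two-row frame, for illustration only.
df = pd.DataFrame({'Subjects': ['Banana', 'Lemon'], 'row_number': [1, 2]})

# A boolean mask selects a Series, even if only one row matches.
match = df.loc[df['Subjects'] == 'Lemon', 'row_number']
print(type(match).__name__)

# item() extracts the scalar; it raises ValueError unless exactly one row matched.
value = match.item()
print(value)

# One .loc call with both the row mask and the column label assigns in place.
df.loc[df['Subjects'] == 'Banana', 'row_number'] = 10
print(df)
```

Because `item()` fails when a subject matches zero rows or several rows, the loop in the answer depends on the `if x in ...` check and on the subjects being unique in the second dataframe.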

(by ACan, Akshay Sehgal, Kel Varnsen)

References

  1. How do you match the value of one dataframe's column with another dataframe's column using conditionals? (CC BY-SA 2.5/3.0/4.0)

#conditional-statements #Python #dataframe





