Dynamically join multiple datasets in a dplyr/purrr workflow


Problem description


I have two lists, one for each of two different years, each containing multiple data frames:

df_18 <- results_2018[[1]] %>%
    select(Answers, Austria)

df_19 <- results_2019[[1]] %>%
    select(Answers, Austria)

They look very similar, like this:

structure(list(Answers = c("45 to 54", "25 to 34", "35 to 44", 
"55 to 64", "16 to 24"), Austria = c(23.3, 21.5, 20.8, 15.6, 
18.8)), row.names = c(NA, -5L), class = "data.frame")

structure(list(Answers = c("45 to 54", "35 to 44", "25 to 34", 
"16 to 24", "55 to 64"), Austria = c(23.4, 20.7, 21.4, 18.7, 
15.8)), row.names = c(NA, -5L), class = "data.frame")

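I need to do a full join on the "Answers" column for each pair of corresponding elements in the two lists.

It should look like the result of the single-pair code below, but for every dataset in the lists, and the overall result should also be a list of data frames.

This is my code for one element per year: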

dplyr::full_join(df_18, df_19, by = "Answers") %>%
    mutate(Difference = Austria.y - Austria.x) %>%
    rename_at(vars(contains(".x")), ~str_replace(.x, ".x", "_2018")) %>%
    rename_at(vars(contains(".y")), ~str_replace(.x, ".y", "_2019")) %>%
    set_names(c("Answers", "Austria_2018", "Austria_2019", "Difference"))

Can anyone help me achieve this?

Thanks :)


Reference solution

Method 1:

If this should be done for corresponding elements of the two lists, use map2, which iterates over both lists in parallel:

library(purrr)
library(dplyr)
library(stringr)
map2(results_2018, results_2019, ~
       full_join(.x %>% select(Answers, Austria),
                 .y %>% select(Answers, Austria),
                 by = "Answers") %>%
         mutate(Difference = Austria.y - Austria.x) %>%
         rename_at(vars(contains(".x")),
                   ~str_replace(., ".x", "_2018")) %>%
         rename_at(vars(contains(".y")),
                   ~str_replace(., ".y", "_2019")) %>%
         set_names(c("Answers", "Austria_2018", "Austria_2019", "Difference")))

(by Data Mastery, akrun)
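
As a possible simplification (a sketch, not part of the answer above), the `suffix` argument of `full_join` can name the year columns directly, which would make the `rename_at` and `set_names` steps unnecessary. The helper name `join_years` and the one-element toy lists are assumptions for illustration; the toy data reuses the values shown in the question.

```r
library(dplyr)
library(purrr)

# Toy stand-ins for the real lists (assumption: one data frame per list,
# filled with the values shown in the question)
results_2018 <- list(
  data.frame(Answers = c("45 to 54", "25 to 34", "35 to 44", "55 to 64", "16 to 24"),
             Austria = c(23.3, 21.5, 20.8, 15.6, 18.8))
)
results_2019 <- list(
  data.frame(Answers = c("45 to 54", "35 to 44", "25 to 34", "16 to 24", "55 to 64"),
             Austria = c(23.4, 20.7, 21.4, 18.7, 15.8))
)

# Hypothetical helper: full join one pair of data frames on Answers and
# let `suffix` label the two Austria columns by year
join_years <- function(df_18, df_19) {
  full_join(
    select(df_18, Answers, Austria),
    select(df_19, Answers, Austria),
    by = "Answers",
    suffix = c("_2018", "_2019")
  ) %>%
    mutate(Difference = Austria_2019 - Austria_2018)
}

# map2 pairs the lists by position and returns a list of data frames
joined <- map2(results_2018, results_2019, join_years)
joined[[1]]
```

Because map2 pairs elements by position, this assumes both lists have the same length and hold the datasets in the same order.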

References

  1. Dynamically join multiple datasets in a dplyr/purrr workflow (CC BY‑SA 2.5/3.0/4.0)

#R #dplyr #purrr





