Tidying the CDP Data - Part 1

My university purchased the CDP database for my academic research.

These posts document the steps to convert data in the CDP spreadsheets in to a tidy format for data analysis. Hopefully, they will be useful to others starting with very messy data in multiple Excel spreadsheets.

Load the R packages

# install.packages('pacman') #uncomment if don't already have
library(pacman) # package manager
p_load(janitor, readxl, tidyverse)

Create variables for emissions years

  • Each year of public CDP responses is in a project year spreadsheet. Put paths to spreadsheets into variable path_in.
  • CDP project year leads the year of emissions by 1 year. Put the years of emissions in the variable years_in
  • CDP project years 2011-2018 report data for years 2010-2017.
path_in <- paste0("~/Dropbox/data/cdp/investor/cdp_", 2011:2018, ".xlsx")
path_in
## [1] "~/Dropbox/data/cdp/investor/cdp_2011.xlsx"
## [2] "~/Dropbox/data/cdp/investor/cdp_2012.xlsx"
## [3] "~/Dropbox/data/cdp/investor/cdp_2013.xlsx"
## [4] "~/Dropbox/data/cdp/investor/cdp_2014.xlsx"
## [5] "~/Dropbox/data/cdp/investor/cdp_2015.xlsx"
## [6] "~/Dropbox/data/cdp/investor/cdp_2016.xlsx"
## [7] "~/Dropbox/data/cdp/investor/cdp_2017.xlsx"
## [8] "~/Dropbox/data/cdp/investor/cdp_2018.xlsx"
year_in <- c(2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017)
year_in
## [1] 2010 2011 2012 2013 2014 2015 2016 2017

Identify the sheets (name or number) in each spreadsheet

For this post we are want the sheet with Summary data.

  • use the map function to loop through all the spreadsheets and list the sheets[^run ?map in the console for help on map function]
  • the output of map is a list
    • [[1]] are sheets in the cdp_2011.xlsx spreadsheet which contains data for 2010
    • [[8]] are sheets in the cdp_2018.xlsx spreadsheet which contains data for 2017
map(path_in, excel_sheets)
## [[1]]
##   [1] "Criteria"           "CRM"                "Summary"           
##   [4] "Introduction"       "0.2"                "0.3"               
##   [7] "Management 1"       "1.2a"               "Management 2"      
##  [10] "Management 3"       "3.1a"               "3.1b"              
##  [13] "3.1c"               "3.1d"               "3.3a"              
##  [16] "3.3b"               "Management 4"       "Risks&Opps 5"      
##  [19] "5.1a"               "5.1c"               "5.1e"              
##  [22] "Risks&Opps 6"       "6.1a"               "6.1c"              
##  [25] "6.1e"               "Emissions 7"        "7.1"               
##  [28] "7.2"                "7.3"                "7.4"               
##  [31] "Emissions 8"        "8.2b"               "8.2c"              
##  [34] "8.2d"               "8.3b"               "8.3c"              
##  [37] "8.3d"               "8.4a"               "8.4a (2)"          
##  [40] "8.5"                "8.6b"               "8.7b"              
##  [43] "Emissions 9"        "9.1a"               "9.2a"              
##  [46] "9.2b"               "9.2c"               "9.2d"              
##  [49] "Emissions 10"       "10.1a"              "10.2a"             
##  [52] "10.2b"              "10.2c"              "Emissions 11"      
##  [55] "11.2a"              "Emissions 12"       "12.2"              
##  [58] "12.3"               "Emissions 13"       "13.1a"             
##  [61] "13.2"               "13.3"               "13.4"              
##  [64] "Emissions 14"       "14.1a"              "14.2a"             
##  [67] "Emissions 15"       "15.1"               "15.2b"             
##  [70] "15.3a"              "Sign Off"           "Oil & Gas OG0"     
##  [73] "Oil & Gas OG1"      "OG1.1"              "OG1.2"             
##  [76] "Oil & Gas OG2"      "OG2.1"              "OG2.3"             
##  [79] "OG2.4"              "Oil & Gas OG3"      "OG3.1"             
##  [82] "OG3.3"              "Oil & Gas OG4"      "OG4.1"             
##  [85] "OG4.3"              "OG4.6"              "OG4.7"             
##  [88] "Oil & Gas OG5"      "OG5.1"              "OG5.2"             
##  [91] "Oil & Gas OG6"      "OG6.1b"             "OG6.1c"            
##  [94] "OG6.1d"             "OG6.1e"             "OG6.1f"            
##  [97] "Electrical EU0"     "Electrical EU1"     "Electrical EU2"    
## [100] "TQ1"                "TQ2"                "TQ3"               
## [103] "TQ4"                "TQ5"                "TQ6"               
## [106] "TQ7"                "TQ8"                "TQ9"               
## [109] "TQ10"               "TQ11"               "TQ12"              
## [112] "Electrical EU3"     "EU3.1a"             "Electrical EU4"    
## [115] "EU4.1"              "EU4.2"              "EU4.3"             
## [118] "Automotive AU0"     "Automotive AU1"     "AU1.1a"            
## [121] "AU1.1b"             "AU1.1c"             "AU1.1d"            
## [124] "AU1.1e"             "AU1.2a"             "AU1.2b"            
## [127] "AU1.2c"             "AU1.3a"             "Automotive AU2"    
## [130] "AU2.3a"             "AU2.3b"             "AU2.3c"            
## [133] "AU2.3d"             "AU2.3e"             "AU2.4a"            
## [136] "AU2.4b"             "AU2.4c"             "Automotive AU3"    
## [139] "AU3.1a"             "AU3.1b"             "AU3.1c"            
## [142] "AU3.1d"             "AU3.1e"             "AU3.1f"            
## [145] "TQ13"               "TQ14"               "TQ15"              
## [148] "TQ16"               "TQ17"               "TQ18"              
## [151] "Management Short 1" "Management Short 2" "Management Short 3"
## [154] "3.1a (2)"           "3.1b (3)"           "3.1d (4)"          
## [157] "3.3a (5)"           "Risks&Opps Short 5" "5.1a (2)"          
## [160] "5.1c (3)"           "5.1e (4)"           "Risks&Opps Short 6"
## [163] "6.1a (2)"           "6.1c (3)"           "6.1e (4)"          
## [166] "Emissions Short 7"  "7.1 (2)"            "7.2 (3)"           
## [169] "Emissions Short 8"  "8.2b (2)"           "8.2c (3)"          
## [172] "8.2d (4)"           "8.3b (5)"           "8.3c (6)"          
## [175] "8.3d (7)"           "8.4a (8)"           "Emissions Short 9" 
## [178] "9.1a (2)"           "Emissions Short 10" "10.1a (2)"         
## [181] "Emissions Short 12" "Emissions Short 13" "13.2 (2)"          
## [184] "Emissions Short 14" "Sheet184"          
## 
## [[2]]
##   [1] "Criteria"       "Doc Control"    "CRM"            "Summary"       
##   [5] "Introduction"   "0.2"            "0.3"            "Management 1"  
##   [9] "1.2a"           "Management 2"   "Management 3"   "3.1a"          
##  [13] "3.1b"           "3.1c"           "3.1d"           "3.3a"          
##  [17] "3.3b"           "3.3c"           "Management 4"   "Risks&Opps 5"  
##  [21] "5.1a"           "5.1c"           "5.1e"           "Risks&Opps 6"  
##  [25] "6.1a"           "6.1c"           "6.1e"           "Emissions 7"   
##  [29] "7.1"            "7.2"            "7.3"            "7.4"           
##  [33] "Emissions 8"    "8.2b"           "8.2c"           "8.2d"          
##  [37] "8.3b"           "8.3c"           "8.3d"           "8.4a"          
##  [41] "8.4a (2)"       "8.5"            "8.6b"           "8.7b"          
##  [45] "Emissions 9"    "9.1a"           "9.2a"           "9.2b"          
##  [49] "9.2c"           "9.2d"           "Emissions 10"   "10.1a"         
##  [53] "10.2a"          "10.2b"          "10.2c"          "Emissions 11"  
##  [57] "11.2a"          "Emissions 12"   "12.2"           "12.3"          
##  [61] "Emissions 13"   "13.1a"          "13.2"           "13.3"          
##  [65] "13.4"           "Emissions 14"   "14.1a"          "14.2a"         
##  [69] "Emissions 15"   "15.1"           "15.2b"          "15.3a"         
##  [73] "Sign Off"       "Electrical EU0" "Electrical EU1" "Electrical EU2"
##  [77] "EU2.1a"         "EU2.1b"         "EU2.1c"         "EU2.1d"        
##  [81] "EU2.1e"         "EU2.1f"         "EU2.1g"         "EU2.1h"        
##  [85] "EU2.1i"         "EU2.1j"         "EU2.1k"         "EU2.1l"        
##  [89] "Electrical EU3" "EU3.1a"         "Electrical EU4" "EU4.1"         
##  [93] "EU4.2"          "EU4.3"          "ICT ICT0"       "ICT1.2"        
##  [97] "ICT1.4a"        "ICT1.4b"        "ICT1.4c"        "ICT1.6"        
## [101] "ICT1.7"         "ICT ICT0 (2)"   "ICT2.2"         "ICT2.3"        
## [105] "ICT2.4"         "ICT ICT0 (3)"   "ICT3.2"         "ICT3.3"        
## [109] "ICT3.4"         "ICT ICT0 (4)"   "ICT4.2"         "ICT4.3"        
## [113] "ICT ICT0 (5)"   "ICT5.2"         "ICT5.3"         "ICT5.4"        
## [117] "ICT ICT0 (6)"   "ICT6.2"         "ICT6.3"         "ICT6.4"        
## [121] "Automotive AU0" "Automotive AU1" "AU1.1a"         "AU1.1b"        
## [125] "AU1.1c"         "AU1.1d"         "AU1.1e"         "AU1.2a"        
## [129] "AU1.2b"         "AU1.2c"         "AU1.3a"         "Automotive AU2"
## [133] "AU2.3a"         "AU2.3b"         "AU2.3c"         "AU2.3d"        
## [137] "AU2.3e"         "AU2.4a"         "AU2.4b"         "AU2.4c"        
## [141] "Automotive AU3" "AU3.1a"         "AU3.1b"         "AU3.1c"        
## [145] "AU3.1d"         "AU3.1e"         "AU3.1f"         "TQ1"           
## [149] "TQ2"            "TQ3"            "TQ4"            "TQ5"           
## [153] "TQ6"            "Oil & Gas OG0"  "Oil & Gas OG1"  "OG1.1"         
## [157] "OG1.2"          "Oil & Gas OG2"  "OG2.1"          "OG2.3"         
## [161] "OG2.4"          "Oil & Gas OG3"  "OG3.1"          "OG3.3"         
## [165] "Oil & Gas OG4"  "OG4.1"          "OG4.3"          "OG4.6"         
## [169] "OG4.7"          "Oil & Gas OG5"  "OG5.1"          "OG5.2"         
## [173] "Oil & Gas OG6"  "OG6.1b"         "OG6.1c"         "OG6.1d"        
## [177] "OG6.1e"         "OG6.1f"        
## 
## [[3]]
##   [1] "Criteria"        "Summary"         "Introduction"    "0.2"            
##   [5] "0.3"             "1"               "1.2a"            "2"              
##   [9] "2.3a"            "2.3c"            "3"               "3.1a"           
##  [13] "3.1b"            "3.1c"            "3.1d"            "3.3a"           
##  [17] "3.3b"            "3.3c"            "4"               "4.1"            
##  [21] "5"               "5.1a"            "5.1c"            "5.1e"           
##  [25] "6"               "6.1a"            "6.1c"            "6.1e"           
##  [29] "7"               "7.1"             "7.2"             "7.3"            
##  [33] "7.4"             "8"               "8.4a"            "8.5"            
##  [37] "8.6b"            "8.6c"            "8.7b"            "9"              
##  [41] "9.1a"            "9.2a"            "9.2b"            "9.2c"           
##  [45] "9.2d"            "9.2e"            "10"              "10.1a"          
##  [49] "10.2a"           "10.2b"           "10.2c"           "10.2d"          
##  [53] "11"              "11.2"            "11.3"            "11.4"           
##  [57] "12"              "12.1a"           "12.2"            "12.3"           
##  [61] "12.4"            "13"              "13.1a"           "13.2a"          
##  [65] "14"              "14.1"            "14.2b"           "14.3a"          
##  [69] "14.4b"           "14.4c"           "OG0"             "OG0.1"          
##  [73] "OG0.2"           "OG0.3"           "OG0.4"           "OG1"            
##  [77] "OG1.2"           "OG1.3"           "OG2"             "OG2.1"          
##  [81] "OG2.3"           "OG2.4"           "OG3"             "OG3.1"          
##  [85] "OG3.3"           "OG4"             "OG4.2"           "OG4.4"          
##  [89] "OG4.7"           "OG4.8"           "OG5"             "OG5.1"          
##  [93] "OG5.2"           "OG5.3"           "OG6"             "OG6.1b"         
##  [97] "OG6.1c"          "OG6.1d"          "OG6.1e"          "OG6.1f"         
## [101] "OG7"             "OG7.1"           "OG7.5a"          "OG7.5c"         
## [105] "OG8"             "OG8.1a"          "OG8.2a"          "OG8.3a"         
## [109] "OG8.4a"          "EU0(Electrical)" "EU0.1"           "EU1(Electrical)"
## [113] "EU1.1"           "EU2(Electrical)" "EU2.1a"          "EU2.1b"         
## [117] "EU2.1c"          "EU2.1d"          "EU2.1e"          "EU2.1f"         
## [121] "EU2.1g"          "EU2.1h"          "EU2.1i"          "EU2.1j"         
## [125] "EU2.1k"          "EU2.1l"          "EU3(Electrical)" "EU3.1a"         
## [129] "EU4(Electrical)" "EU4.1"           "EU4.2"           "EU4.3"          
## [133] "AU0(Automotive)" "AU0.1"           "AU1(Automotive)" "AU1.1a"         
## [137] "AU1.1b"          "AU1.1c"          "AU1.1d"          "AU1.1e"         
## [141] "AU1.2a"          "AU1.2b"          "AU1.2c"          "AU1.3a"         
## [145] "AU2(Automotive)" "AU2.3a"          "AU2.3b"          "AU2.3c"         
## [149] "AU2.3d"          "AU2.3e"          "AU2.4a"          "AU2.4b"         
## [153] "AU2.4c"          "AU3(Automotive)" "AU3.1a"          "AU3.1b"         
## [157] "AU3.1c"          "AU3.1d"          "AU3.1e"          "AU3.1f"         
## [161] "AU3.1gi"         "AU3.1gii"        "AU3.1giii"       "AU3.1giv"       
## [165] "AU3.1gv"         "AU3.1gvi"        "ICT1"            "ICT1.2"         
## [169] "ICT1.3"          "ICT1.4a"         "ICT1.4b"         "ICT1.4c"        
## [173] "ICT1.7"          "ICT1.8a"         "ICT2"            "ICT2.2"         
## [177] "ICT2.3"          "ICT3"            "ICT3.2"          "ICT3.3"         
## [181] "ICT3.4"          "ICT4"            "ICT4.2"          "ICT4.3"         
## [185] "ICT5"            "ICT5.2"          "ICT5.3"          "ICT5.4"         
## [189] "ICT6"            "ICT6.2"          "ICT6.3"          "ICT6.4"         
## [193] "Sign Off"       
## 
## [[4]]
##   [1] "Criteria"                       "Summary"                       
##   [3] "CC0. Introduction"              "CC0.2"                         
##   [5] "CC0.3"                          "CC1. Governance"               
##   [7] "CC1.2a"                         "CC2. Strategy"                 
##   [9] "CC2.1a"                         "CC2.1d"                        
##  [11] "CC2.3a"                         "CC2.3c"                        
##  [13] "CC3. Targets & Initiatives"     "CC3.1a"                        
##  [15] "CC3.1b"                         "CC3.1c"                        
##  [17] "CC3.1d"                         "CC3.3a"                        
##  [19] "CC3.3b"                         "CC3.3c"                        
##  [21] "CC4. Communication"             "CC4.1"                         
##  [23] "CC5. Climate Change Risks"      "CC5.1a"                        
##  [25] "CC5.1b"                         "CC5.1c"                        
##  [27] "CC6. Climate Change Opps"       "CC6.1a"                        
##  [29] "CC6.1b"                         "CC6.1c"                        
##  [31] "CC7. Emissions Methodology"     "CC7.1"                         
##  [33] "CC7.2"                          "CC7.3"                         
##  [35] "CC7.4"                          "CC8. Emissions Data"           
##  [37] "CC8.4a"                         "CC8.5"                         
##  [39] "CC8.6a"                         "CC8.6b"                        
##  [41] "CC8.7a"                         "CC8.8"                         
##  [43] "CC9. S1 Emissions Breakdown"    "CC9.1a"                        
##  [45] "CC9.2a"                         "CC9.2b"                        
##  [47] "CC9.2c"                         "CC9.2d"                        
##  [49] "CC9.2e"                         "CC10. S2 Emissions Breakdown"  
##  [51] "CC10.1a"                        "CC10.2a"                       
##  [53] "CC10.2b"                        "CC10.2c"                       
##  [55] "CC10.2d"                        "CC11. Energy"                  
##  [57] "CC11.2"                         "CC11.3"                        
##  [59] "CC11.4"                         "CC12. Emissions Performance"   
##  [61] "CC12.1a"                        "CC12.2"                        
##  [63] "CC12.3"                         "CC12.4"                        
##  [65] "CC13. Emissions Trading"        "CC13.1a"                       
##  [67] "CC13.2a"                        "CC14. Scope 3 Emissions"       
##  [69] "CC14.1"                         "CC14.2a"                       
##  [71] "CC14.3a"                        "CC14.4b"                       
##  [73] "CC14.4c"                        "CC15. Sign Off"                
##  [75] "CC15.1"                         "OG0. Reference information"    
##  [77] "OG0.1"                          "OG0.2"                         
##  [79] "OG0.3"                          "OG1. Production & reserves"    
##  [81] "OG1.2"                          "OG1.3"                         
##  [83] "OG1.5"                          "OG2. Emissions by VC segment"  
##  [85] "OG2.1"                          "OG2.3"                         
##  [87] "OG2.4"                          "OG3. S1 emissions by category" 
##  [89] "OG3.1"                          "OG3.3"                         
##  [91] "OG4. Transfers & sequestration" "OG4.2"                         
##  [93] "OG4.4"                          "OG4.7"                         
##  [95] "OG4.8"                          "OG5. Sales & intensity"        
##  [97] "OG5.1"                          "OG5.2"                         
##  [99] "OG6. Development strategy"      "OG6.1"                         
## [101] "OG6.2"                          "OG6.3"                         
## [103] "OG7. Methane"                   "OG7.1"                         
## [105] "OG7.5a"                         "OG7.5c"                        
## [107] "OG8. Methane"                   "OG8.1a"                        
## [109] "OG8.2a"                         "OG8.3a"                        
## [111] "OG8.4a"                         "EU0. Reference Dates"          
## [113] "EU0.1"                          "EU1. Global Totals by Year"    
## [115] "EU1.1"                          "EU2. Country Profiles"         
## [117] "EU2.1a"                         "EU2.1b"                        
## [119] "EU2.1c"                         "EU2.1d"                        
## [121] "EU2.1e"                         "EU2.1f"                        
## [123] "EU2.1g"                         "EU2.1h"                        
## [125] "EU2.1i"                         "EU2.1j"                        
## [127] "EU2.1k"                         "EU2.1l"                        
## [129] "EU3. Renewable Electricity"     "EU3.1a"                        
## [131] "EU4. Renewable Electricity"     "EU4.1"                         
## [133] "EU4.2"                          "EU4.3"                         
## [135] "AU0. Reference Dates"           "AU0.1"                         
## [137] "AU1. Sales Volumes"             "AU1.1a"                        
## [139] "AU1.1b"                         "AU1.1c"                        
## [141] "AU1.1d"                         "AU1.1e"                        
## [143] "AU1.2a"                         "AU1.2b"                        
## [145] "AU1.2c"                         "AU1.3a"                        
## [147] "AU2. Emissions"                 "AU2.3a"                        
## [149] "AU2.3b"                         "AU2.3c"                        
## [151] "AU2.3d"                         "AU2.3e"                        
## [153] "AU2.4a"                         "AU2.4b"                        
## [155] "AU2.4c"                         "AU3. Clean Technologies"       
## [157] "AU3.1a"                         "AU3.1b"                        
## [159] "AU3.1c"                         "AU3.1d"                        
## [161] "AU3.1e"                         "AU3.1f"                        
## [163] "AU3.1gi"                        "AU3.1gii"                      
## [165] "AU3.1giii"                      "AU3.1giv"                      
## [167] "AU3.1gv"                        "AU3.1gvi"                      
## [169] "ICT1"                           "ICT1.2"                        
## [171] "ICT1.3"                         "ICT1.4a"                       
## [173] "ICT1.4b"                        "ICT1.4c"                       
## [175] "ICT1.7"                         "ICT1.8a"                       
## [177] "ICT2"                           "ICT2.2"                        
## [179] "ICT2.3"                         "ICT3"                          
## [181] "ICT3.2"                         "ICT3.3"                        
## [183] "ICT3.4"                         "ICT4"                          
## [185] "ICT4.2"                         "ICT4.3"                        
## [187] "ICT5"                           "ICT5.2"                        
## [189] "ICT5.3"                         "ICT5.4"                        
## [191] "ICT6"                           "ICT6.2"                        
## [193] "ICT6.3"                         "ICT6.4"                        
## [195] "FBT1. Agriculture"              "FBT1.3a"                       
## [197] "FBT1.4a"                        "FBT1.4b"                       
## [199] "FBT1.6a"                        "FBT1.6b"                       
## [201] "FBT2. Processing"               "FBT2.3a"                       
## [203] "FBT3. Distribution"             "FBT3.3a"                       
## [205] "FBT4. Consumption"             
## 
## [[5]]
##   [1] "Summary"   "CC0.1"     "CC0.2"     "CC0.3"     "CC1.1"     "CC1.2a"   
##   [7] "CC2.1"     "CC2.1a"    "CC2.1d"    "CC2.3a"    "CC2.3c"    "CC3.1"    
##  [13] "CC3.1a"    "CC3.1b"    "CC3.1c"    "CC3.1d"    "CC3.3a"    "CC3.3b"   
##  [19] "CC3.3c"    "CC4.1"     "CC4.1T"    "CC5.1"     "CC5.1a"    "CC5.1b"   
##  [25] "CC5.1c"    "CC6.1"     "CC6.1a"    "CC6.1b"    "CC6.1c"    "CC7.1"    
##  [31] "CC7.1T"    "CC7.2"     "CC7.3"     "CC7.4"     "CC8.1"     "CC8.4a"   
##  [37] "CC8.5"     "CC8.6a"    "CC8.6b"    "CC8.7a"    "CC8.8"     "CC9.1"    
##  [43] "CC9.1a"    "CC9.2a"    "CC9.2b"    "CC9.2c"    "CC9.2d"    "CC9.2e"   
##  [49] "CC10.1"    "CC10.1a"   "CC10.2a"   "CC10.2b"   "CC10.2c"   "CC10.2d"  
##  [55] "CC11.1"    "CC11.2"    "CC11.3"    "CC11.4"    "CC12.1"    "CC12.1a"  
##  [61] "CC12.2"    "CC12.3"    "CC12.4"    "CC13.1"    "CC13.1a"   "CC13.2a"  
##  [67] "CC14.1"    "CC14.1T"   "CC14.2a"   "CC14.3a"   "CC14.4b"   "CC14.4c"  
##  [73] "CC15.1"    "CC15.1T"   "OG0.1"     "OG1.1"     "OG1.2"     "OG1.3"    
##  [79] "OG1.5"     "OG2.1"     "OG2.1T"    "OG2.3"     "OG2.4"     "OG3.1"    
##  [85] "OG3.1T"    "OG3.3"     "OG4.1"     "OG4.2"     "OG4.4"     "OG4.7"    
##  [91] "OG4.8"     "OG5.1"     "OG5.1T"    "OG5.2"     "OG6.1"     "OG6.1T"   
##  [97] "OG6.2"     "OG6.3"     "OG7.1"     "OG7.1T"    "OG7.3"     "OG7.3b"   
## [103] "OG7.5a"    "OG7.6a"    "OG7.7a"    "OG7.8a"    "EU0.1"     "EU0.1T"   
## [109] "EU1.1"     "EU1.1T"    "EU2.1"     "EU2.1a"    "EU2.1b"    "EU2.1c"   
## [115] "EU2.1d"    "EU2.1e"    "EU2.1f"    "EU2.1g"    "EU2.1h"    "EU2.1i"   
## [121] "EU2.1j"    "EU2.1k"    "EU2.1l"    "EU3.1"     "EU3.1a"    "EU4.1"    
## [127] "EU4.1T"    "EU4.2"     "EU4.3"     "AU0.1"     "AU0.1T"    "AU1.1a"   
## [133] "AU1.1aT"   "AU1.1b"    "AU1.1c"    "AU1.1d"    "AU1.1e"    "AU1.2a"   
## [139] "AU1.2b"    "AU1.2c"    "AU1.3a"    "AU2.1"     "AU2.3a"    "AU2.3b"   
## [145] "AU2.3c"    "AU2.3d"    "AU2.3e"    "AU2.4a"    "AU2.4b"    "AU2.4c"   
## [151] "AU3.1a"    "AU3.1aT"   "AU3.1b"    "AU3.1c"    "AU3.1d"    "AU3.1e"   
## [157] "AU3.1f"    "AU3.1gi"   "AU3.1gii"  "AU3.1giii" "AU3.1giv"  "AU3.1gv"  
## [163] "AU3.1gvi"  "ICT0.1a"   "ICT1.2"    "ICT1.3"    "ICT1.4a"   "ICT1.4b"  
## [169] "ICT1.4c"   "ICT1.7"    "ICT1.8a"   "ICT0.1b"   "ICT2.2"    "ICT2.3"   
## [175] "ICT0.1c"   "ICT3.2"    "ICT3.3"    "ICT3.4"    "ICT0.1d"   "ICT4.2"   
## [181] "ICT4.3"    "ICT0.1e"   "ICT5.2"    "ICT5.3"    "ICT5.4"    "ICT0.1f"  
## [187] "ICT6.2"    "ICT6.3"    "ICT6.4"    "FBT1.1"    "FBT1.3b"   "FBT1.3c"  
## [193] "FBT1.4a"   "FBT1.4b"   "FBT1.5a"   "FBT1.6a"   "FBT1.7a"   "FBT1.7b"  
## [199] "FBT2.1"    "FBT2.3a"   "FBT3.1"    "FBT3.3a"   "FBT4.1"   
## 
## [[6]]
##   [1] "Criteria"  "Summary"   "CC0.1"     "CC0.2"     "CC0.3"     "CC1.1"    
##   [7] "CC1.2a"    "CC2.1"     "CC2.1a"    "CC2.1d"    "CC2.3a"    "CC2.3c"   
##  [13] "CC3.1"     "CC3.1a"    "CC3.1b"    "CC3.1c"    "CC3.1d"    "CC3.1e"   
##  [19] "CC3.2a"    "CC3.3a"    "CC3.3b"    "CC3.3c"    "CC4.1"     "CC4.1T"   
##  [25] "CC5.1"     "CC5.1a"    "CC5.1b"    "CC5.1c"    "CC6.1"     "CC6.1a"   
##  [31] "CC6.1b"    "CC6.1c"    "CC7.1"     "CC7.1T"    "CC7.2"     "CC7.3"    
##  [37] "CC7.4"     "CC8.1"     "CC8.3a"    "CC8.4a"    "CC8.5"     "CC8.6a"   
##  [43] "CC8.6b"    "CC8.7a"    "CC8.8"     "CC9.1"     "CC9.1a"    "CC9.2a"   
##  [49] "CC9.2b"    "CC9.2c"    "CC9.2d"    "CC10.1"    "CC10.1a"   "CC10.2a"  
##  [55] "CC10.2b"   "CC10.2c"   "CC11.1"    "CC11.2"    "CC11.3a"   "CC11.4"   
##  [61] "CC11.5"    "CC12.1"    "CC12.1a"   "CC12.2"    "CC12.3"    "CC13.1"   
##  [67] "CC13.1a"   "CC13.2a"   "CC14.1"    "CC14.1T"   "CC14.2a"   "CC14.3a"  
##  [73] "CC14.4b"   "CC14.4c"   "CC15.1"    "CC15.1T"   "OG0.1"     "OG1.1"    
##  [79] "OG1.2"     "OG1.3"     "OG1.5"     "OG2.1"     "OG2.1T"    "OG2.3"    
##  [85] "OG2.4"     "OG3.1"     "OG3.1T"    "OG3.3"     "OG4.1"     "OG4.2"    
##  [91] "OG4.4"     "OG4.7"     "OG4.8"     "OG5.1"     "OG5.1T"    "OG5.2"    
##  [97] "OG6.1"     "OG6.1T"    "OG6.2"     "OG6.3"     "OG7.1"     "OG7.1T"   
## [103] "OG7.4"     "OG7.5"     "EU0.1"     "EU0.1T"    "EU1.1"     "EU1.1T"   
## [109] "EU2.1"     "EU2.1a"    "EU2.1b"    "EU2.1c"    "EU2.1d"    "EU2.1e"   
## [115] "EU2.1f"    "EU2.1g"    "EU2.1h"    "EU2.1i"    "EU2.1j"    "EU2.1k"   
## [121] "EU2.1l"    "EU3.1"     "EU3.1a"    "EU4.1"     "EU4.1T"    "EU4.2"    
## [127] "EU4.3"     "AU0.1"     "AU0.1T"    "AU1.1a"    "AU1.1aT"   "AU1.1b"   
## [133] "AU1.1c"    "AU1.1d"    "AU1.1e"    "AU1.1f"    "AU1.1g"    "AU1.1h"   
## [139] "AU1.1i"    "AU1.2a"    "AU1.2b"    "AU1.2c"    "AU1.2d"    "AU1.2e"   
## [145] "AU1.2f"    "AU1.2g"    "AU1.2h"    "AU1.3a"    "AU1.3b"    "AU1.3c"   
## [151] "AU2.1"     "AU2.3a"    "AU2.3b"    "AU2.3c"    "AU3.1a"    "AU3.1aT"  
## [157] "AU3.1b"    "AU3.1c"    "AU3.1d"    "AU3.1e"    "AU3.1f"    "AU3.1gi"  
## [163] "AU3.1gii"  "AU3.1giii" "AU3.1giv"  "AU3.1gv"   "AU3.1gvi"  "AU3.2"    
## [169] "AU3.3"     "AU3.4"     "ICT0.1a"   "ICT1.2"    "ICT1.3"    "ICT1.4a"  
## [175] "ICT1.4b"   "ICT1.4c"   "ICT1.7"    "ICT1.8a"   "ICT0.1b"   "ICT2.2"   
## [181] "ICT2.3"    "ICT0.1c"   "ICT3.2"    "ICT3.3"    "ICT3.4"    "ICT0.1d"  
## [187] "ICT4.2"    "ICT4.3"    "ICT0.1e"   "ICT5.2"    "ICT5.3"    "ICT5.4"   
## [193] "ICT0.1f"   "ICT6.2"    "ICT6.3"    "ICT6.4"    "FBT1.1"    "FBT1.3b"  
## [199] "FBT1.3c"   "FBT1.4a"   "FBT1.4b"   "FBT1.5a"   "FBT1.6a"   "FBT1.7a"  
## [205] "FBT1.7b"   "FBT2.1"    "FBT2.3a"   "FBT3.1"    "FBT3.3a"   "FBT4.1"   
## [211] "Sheet1"   
## 
## [[7]]
##   [1] "Summary Data"                    "AU0.1"                          
##   [3] "AU1. Sales Volumes"              "AU1.1a"                         
##   [5] "AU1.1b"                          "AU1.1c"                         
##   [7] "AU1.1d"                          "AU1.1e"                         
##   [9] "AU1.1f"                          "AU1.1g"                         
##  [11] "AU1.1h"                          "AU1.1i"                         
##  [13] "AU1.2a"                          "AU1.2b"                         
##  [15] "AU1.2c"                          "AU1.2d"                         
##  [17] "AU1.2e"                          "AU1.2f"                         
##  [19] "AU1.2g"                          "AU1.2h"                         
##  [21] "AU1.3a"                          "AU1.3b"                         
##  [23] "AU1.3c"                          "AU2. Regulatory Compliance"     
##  [25] "AU2.3a"                          "AU2.3b"                         
##  [27] "AU2.3c"                          "AU3. Clean Technologies"        
##  [29] "AU3.1a"                          "AU3.1b"                         
##  [31] "AU3.1c"                          "AU3.1d"                         
##  [33] "AU3.1e"                          "AU3.1f"                         
##  [35] "AU3.1gi"                         "AU3.1gii"                       
##  [37] "AU3.1giii"                       "AU3.1giv"                       
##  [39] "AU3.1gv"                         "AU3.1gvi"                       
##  [41] "AU3.2"                           "AU3.3"                          
##  [43] "AU3.4"                           "CC0. Introduction"              
##  [45] "CC0.2"                           "CC0.3"                          
##  [47] "CC1. Governance"                 "CC1.2a"                         
##  [49] "CC2. Strategy"                   "CC2.1a"                         
##  [51] "CC2.1d"                          "CC2.3a"                         
##  [53] "CC2.3c"                          "CC3. Targets and Initiatives"   
##  [55] "CC3.1a"                          "CC3.1b"                         
##  [57] "CC3.1c"                          "CC3.1d"                         
##  [59] "CC3.1e"                          "CC3.2a"                         
##  [61] "CC3.3a"                          "CC3.3b"                         
##  [63] "CC3.3c"                          "CC4.1"                          
##  [65] "CC5. Climate Change Risks"       "CC5.1a"                         
##  [67] "CC5.1b"                          "CC5.1c"                         
##  [69] "CC6. Climate Change Opportuniti" "CC6.1a"                         
##  [71] "CC6.1b"                          "CC6.1c"                         
##  [73] "CC7. Emissions Methodology"      "CC7.1"                          
##  [75] "CC7.2"                           "CC7.3"                          
##  [77] "CC7.4"                           "CC8. Emissions Data"            
##  [79] "CC8.3"                           "CC8.3a"                         
##  [81] "CC8.4a"                          "CC8.5"                          
##  [83] "CC8.6a"                          "CC8.6b"                         
##  [85] "CC8.7a"                          "CC8.8"                          
##  [87] "CC9. Scope 1 Emissions Breakdow" "CC9.1a"                         
##  [89] "CC9.2a"                          "CC9.2b"                         
##  [91] "CC9.2c"                          "CC9.2d"                         
##  [93] "CC10. Scope 2 Emissions Breakdo" "CC10.1a"                        
##  [95] "CC10.2a"                         "CC10.2b"                        
##  [97] "CC10.2c"                         "CC11. Energy"                   
##  [99] "CC11.2"                          "CC11.3a"                        
## [101] "CC11.4"                          "CC11.5"                         
## [103] "CC12. Emissions Performance"     "CC12.1a"                        
## [105] "CC12.2"                          "CC12.3"                         
## [107] "CC13. Emissions Trading"         "CC13.1a"                        
## [109] "CC13.2a"                         "CC14. Scope 3 Emissions"        
## [111] "CC14.1"                          "CC14.2a"                        
## [113] "CC14.3a"                         "CC14.4b"                        
## [115] "CC15.1"                          "EU0.1"                          
## [117] "EU1.1"                           "EU2. Individual Country Profile"
## [119] "EU2.1a"                          "EU2.1b"                         
## [121] "EU2.1c"                          "EU2.1d"                         
## [123] "EU2.1e"                          "EU2.1f"                         
## [125] "EU2.1g"                          "EU2.1h"                         
## [127] "EU2.1i"                          "EU2.1j"                         
## [129] "EU2.1k"                          "EU2.1l"                         
## [131] "EU3. Renewable Electricity Sour" "EU3.1a"                         
## [133] "EU4.1"                           "EU4.2"                          
## [135] "EU4.3"                           "FBT1. Agriculture"              
## [137] "FBT1.2a"                         "FBT1.3a"                        
## [139] "FBT1.3b"                         "FBT1.3c"                        
## [141] "FBT1.4a"                         "FBT1.4b"                        
## [143] "FBT1.5a"                         "FBT1.6a"                        
## [145] "FBT1.7a"                         "FBT1.7b"                        
## [147] "FBT2. Processing"                "FBT2.3a"                        
## [149] "FBT3. Distribution"              "FBT3.3a"                        
## [151] "FBT4. Consumption"               "ICT1. Data center activities"   
## [153] "ICT1.2"                          "ICT1.3"                         
## [155] "ICT1.4a"                         "ICT1.4b"                        
## [157] "ICT1.4c"                         "ICT1.7"                         
## [159] "ICT1.8a"                         "ICT2. Provision of networkconne"
## [161] "ICT2.2"                          "ICT2.3"                         
## [163] "ICT3. Manufacture or assembly o" "ICT3.2"                         
## [165] "ICT3.3"                          "ICT3.4"                         
## [167] "ICT4. Manufacture of software"   "ICT4.2"                         
## [169] "ICT4.3"                          "ICT5. Business services (office"
## [171] "ICT5.2"                          "ICT5.3"                         
## [173] "ICT5.4"                          "ICT6. Other activities"         
## [175] "ICT6.2"                          "ICT6.3"                         
## [177] "ICT6.4"                          "OG0. Reference information"     
## [179] "OG1. Production, reserves and s" "OG1.1"                          
## [181] "OG1.2"                           "OG1.3"                          
## [183] "OG1.5"                           "OG1.6"                          
## [185] "OG2. Emissions by segment in th" "OG2.1"                          
## [187] "OG2.3"                           "OG2.4"                          
## [189] "OG3. Scope 1 emissions by emiss" "OG3.1"                          
## [191] "OG3.3"                           "OG4. Transfers & sequestration "
## [193] "OG4.2"                           "OG4.4"                          
## [195] "OG4.7"                           "OG4.8"                          
## [197] "OG5. Emissions intensity"        "OG5.1"                          
## [199] "OG6.1"                           "OG6.2"                          
## [201] "OG6.3"                           "OG7. Methane from the natural g"
## [203] "OG7.1"                           "OG7.4"                          
## [205] "OG7.5"                          
## 
## [[8]]
##   [1] "Summary Data"                    "C0 - Introduction"              
##   [3] "C0.2"                            "C0.3"                           
##   [5] "C1 - Governance"                 "C1.1a"                          
##   [7] "C1.1b"                           "C1.1c"                          
##   [9] "C1.2"                            "C1.3a"                          
##  [11] "C2 - Risks and Opportunities"    "C2.1"                           
##  [13] "C2.2a"                           "C2.2c"                          
##  [15] "C2.2e"                           "C2.3a"                          
##  [17] "C2.3b"                           "C2.4a"                          
##  [19] "C2.4b"                           "C2.5"                           
##  [21] "C2.6"                            "C3 - Business strategy"         
##  [23] "C3.1d"                           "C4 - Targets and Performance"   
##  [25] "C4.1a"                           "C4.1b"                          
##  [27] "C4.1c"                           "C4.2"                           
##  [29] "C4.3a"                           "C4.3b"                          
##  [31] "C4.3c"                           "C4.5a"                          
##  [33] "C5 - Emissions methodology"      "C5.1"                           
##  [35] "C6 - Emissions data"             "C6.1"                           
##  [37] "C6.10"                           "C6.2"                           
##  [39] "C6.3"                            "C6.4a"                          
##  [41] "C6.5"                            "C7 - Emissions breakdown"       
##  [43] "C7.1a"                           "C7.2"                           
##  [45] "C7.3a"                           "C7.3b"                          
##  [47] "C7.3c"                           "C7.5"                           
##  [49] "C7.6a"                           "C7.6b"                          
##  [51] "C7.6c"                           "C7.9a"                          
##  [53] "C8 - Energy"                     "C8.2"                           
##  [55] "C8.2a"                           "C8.2b"                          
##  [57] "C8.2c"                           "C8.2d"                          
##  [59] "C8.2e"                           "C8.2f"                          
##  [61] "C9 - Additional metrics"         "C10 - Verification"             
##  [63] "C10.1a"                          "C10.1b"                         
##  [65] "C10.2"                           "C10.2a"                         
##  [67] "C11 - Carbon pricing"            "C11.1b"                         
##  [69] "C11.1c"                          "C11.2a"                         
##  [71] "C11.3a"                          "C12 - Engagement"               
##  [73] "C12.1a"                          "C12.1b"                         
##  [75] "C12.3a"                          "C12.3c"                         
##  [77] "C12.4"                           "C14 - Signoff"                  
##  [79] "C-AC0.6a_C-FB0.6a_C-PF0.6a"      "C-AC0.6b_C-FB0.6b_C-PF0.6b"     
##  [81] "C-AC0.6c_C-FB0.6c_C-PF0.6c"      "C-AC0.6_C-FB0.6_C-PF0.6"        
##  [83] "C-AC0.6d_C-FB0.6d_C-PF0.6d"      "C-AC0.6e_C-FB0.6e_C-PF0.6e"     
##  [85] "C-AC0.6f_C-FB0.6f_C-PF0.6f"      "C-AC0.6g_C-FB0.6g_C-PF0.6g"     
##  [87] "C-AC0.7_C-FB0.7_C-PF0.7"         "C-AC3.1b_C-CE3.1b_C-CH3.1b_C-CO"
##  [89] "C-AC3.1e_C-CE3.1e_C-CH3.1e_C-CO" "C-AC4.4a_C-FB4.4a_C-PF4.4a"     
##  [91] "C-AC4.4_C-FB4.4_C-PF4.4"         "C-AC6.6a_C-FB6.6a_C-PF6.6a"     
##  [93] "C-AC6.6b_C-FB6.6b_C-PF6.6b"      "C-AC6.6_C-FB6.6_C-PF6.6"        
##  [95] "C-AC6.8a_C-FB6.8a_C-PF6.8a"      "C-AC6.8_C-FB6.8_C-PF6.8"        
##  [97] "C-AC6.9a_C-FB6.9a_C-PF6.9a"      "C-AC6.9_C-FB6.9_C-PF6.9"        
##  [99] "C-AC7.4a_C-FB7.4a_C-PF7.4a"      "C-AC7.4b_C-FB7.4b_C-PF7.4b"     
## [101] "C-AC7.4c_C-FB7.4c_C-PF7.4c"      "C-AC7.4_C-FB7.4_C-PF7.4"        
## [103] "C-AC12.2a_C-FB12.2a_C-PF12.2a"   "C-AC12.2b_C-FB12.2b_C-PF12.2b"  
## [105] "C-AC12.2c_C-FB12.2c_C-PF12.2c"   "C-AC12.2_C-FB12.2_C-PF12.2"     
## [107] "C-AC13.1a_C-FB13.1a_C-PF13.1a"   "C-AC13.1_C-FB13.1_C-PF13.1"     
## [109] "C-AC13.2a_C-FB13.2a_C-PF13.2a"   "C-AC13.2_C-FB13.2_C-PF13.2"     
## [111] "C-CE0.7"                         "C-CE4.9"                        
## [113] "C-CE6.11"                        "C-CE7.4_C-CH7.4_C-CO7.4_C-EU7.4"
## [115] "C-CE7.7_C-CH7.7_C-CO7.7_C-MM7.7" "C-CE8.2a"                       
## [117] "C-CE8.2c"                        "C-CE8.2e"                       
## [119] "C-CE9.3a"                        "C-CE9.3b"                       
## [121] "C-CE9.3c"                        "C-CE9.6"                        
## [123] "C-CH0.7"                         "C-CH7.8"                        
## [125] "C-CH7.8a"                        "C-CH8.2a"                       
## [127] "C-CH8.2e"                        "C-CH8.3"                        
## [129] "C-CH8.3a"                        "C-CH9.3a"                       
## [131] "C-CH9.6"                         "C-CO0.7"                        
## [133] "C-CO4.2a"                        "C-CO4.6"                        
## [135] "C-CO4.7"                         "C-CO4.7a"                       
## [137] "C-CO4.7b"                        "C-CO4.8"                        
## [139] "C-CO7.1b"                        "C-CO9.2a"                       
## [141] "C-CO9.2b"                        "C-CO9.3a"                       
## [143] "C-CO9.3b"                        "C-CO9.4a"                       
## [145] "C-CO9.6_C-EU9.6_C-OG9.6"         "C-EU0.7"                        
## [147] "C-EU4.6"                         "C-EU7.1b"                       
## [149] "C-EU8.2e"                        "C-EU8.4"                        
## [151] "C-EU8.4a"                        "C-EU9.5a"                       
## [153] "C-EU9.5b"                        "C-FI"                           
## [155] "C-MM0.7"                         "C-MM8.2a"                       
## [157] "C-MM8.2e"                        "C-MM9.3a"                       
## [159] "C-MM9.3b"                        "C-MM9.6"                        
## [161] "C-OG0.7"                         "C-OG4.2a"                       
## [163] "C-OG4.6"                         "C-OG4.7"                        
## [165] "C-OG4.7a"                        "C-OG4.7b"                       
## [167] "C-OG4.8"                         "C-OG6.12"                       
## [169] "C-OG6.13"                        "C-OG7.1b"                       
## [171] "C-OG9.2a"                        "C-OG9.2b"                       
## [173] "C-OG9.2c"                        "C-OG9.2d"                       
## [175] "C-OG9.2e"                        "C-OG9.3a"                       
## [177] "C-OG9.3b"                        "C-OG9.3c"                       
## [179] "C-OG9.3d"                        "C-OG9.3e"                       
## [181] "C-OG9.7"                         "C-OG9.8"                        
## [183] "C-OG9.8a"                        "C-OG9.8b"                       
## [185] "C-OG9.8c"                        "C-ST0.7"                        
## [187] "C-ST4.9"                         "C-ST6.14"                       
## [189] "C-ST8.2a"                        "C-ST8.2e"                       
## [191] "C-ST8.3"                         "C-ST9.3a"                       
## [193] "C-ST9.3b"                        "C-ST9.6"                        
## [195] "C-TO0.7_C-TS0.7"                 "C-TO7.8"                        
## [197] "C-TO8.4"                         "C-TO9.3_C-TS9.3"                
## [199] "C-TO9.6_C-TS9.6"                 "C-TS6.15"                       
## [201] "C-TS8.2h"                        "C-TS8.4"

Use output from map to create variable sheet_in

sheet_in contains the the sheet names

  • years 2010-2016 sheet is named Summary
  • years 2010-2016 sheet is named Summary Data
sheet_in <- c(rep("Summary", 6), rep("Summary Data", 2)) # identify by name
# sheet_in  <- c(3, 4, 2, 2, 1, 2, 1, 1) # OR identify by sheet position
sheet_in
## [1] "Summary"      "Summary"      "Summary"      "Summary"      "Summary"     
## [6] "Summary"      "Summary Data" "Summary Data"

Use map2 to determine columns on sheets of sheet_in

Create function find_col to read summary sheet (sheet_in) on each excel spreadsheets (path_in)

For each sheet

  • read sheet in read_excel
  • use function from package janitor to clean_names

Put output in c_summary

find_col <- function(path_in, sheet_in) {
  read_excel(path = path_in, sheet = sheet_in) %>% clean_names()
}

c_summary <- map2(path_in, sheet_in, find_col)
c_summary
## [[1]]
## # A tibble: 1,556 x 14
##    organisation account_no discloser_id country gics_sector gics_industry_g…
##    <chr>        <chr>      <chr>        <chr>   <chr>       <chr>           
##  1 Bharat Petr… 1800       dccd5f55-18… India   Energy      Energy          
##  2 EDP - Energ… 5300       a3cf9f2e-19… Portug… Utilities   Utilities       
##  3 Ernst & You… 29900      d4e6de3d-19… United… Banks       Diversified Fin…
##  4 Kizoo AG     10200      575a3dc1-19… Germany Informatio… Software & Serv…
##  5 Kungsleden … 22300      44af35c7-19… Sweden  Banks       Real Estate     
##  6 Nordea Bank  13400      b932a917-1a… Sweden  Banks       Diversified Fin…
##  7 QLogic Corp. 15400      2fb28347-1a… USA     Informatio… Technology Hard…
##  8 Railway Pro… 32500      eeac8193-1e… Ireland N/A         N/A             
##  9 Royal Wessa… 23100      3ac117f5-1a… Nether… Consumer S… Food, Beverage …
## 10 Sumitomo Me… 17900      5c8d636c-0b… Japan   Materials   Materials       
## # … with 1,546 more rows, and 8 more variables: gics_industry <chr>,
## #   gics_sub_industry <chr>, access <chr>, primary_expansion <chr>,
## #   secondary_expansions <chr>, complexity <chr>, status <chr>, documents <lgl>
## 
## [[2]]
## # A tibble: 1,730 x 14
##    organisation account_no discloser_id country gics_sector gics_industry_g…
##    <chr>        <chr>      <chr>        <chr>   <chr>       <chr>           
##  1 Bel Fuse In… 35800      c8e5c0ab-e9… USA     Informatio… Technology Hard…
##  2 EDP - Energ… 5300       745d83c3-ec… Portug… Utilities   Utilities       
##  3 Elisa Oyj    5400       765e83c3-ec… Finland Telecommun… Telecommunicati…
##  4 Ernst & You… 29900      65e973cf-ec… United… Financials  Diversified Fin…
##  5 Kungsleden … 22300      b380220e-ee… Sweden  Financials  Real Estate     
##  6 Nordea Bank  13400      1cc368a0-ee… Sweden  Financials  Diversified Fin…
##  7 QLogic Corp. 15400      f287bd17-ef… USA     Informatio… Technology Hard…
##  8 Railway Pro… 32500      8488bd17-ef… Ireland Industrials Transportation  
##  9 Royal Wessa… 23100      c49dbd2f-ef… Nether… Consumer S… Food, Beverage …
## 10 Technicolor… 19100      7b653ffd-ef… France  Consumer D… Consumer Durabl…
## # … with 1,720 more rows, and 8 more variables: gics_industry <chr>,
## #   gics_sub_industry <chr>, access <chr>, primary_expansion <chr>,
## #   secondary_expansions <chr>, complexity <chr>, status <chr>, documents <lgl>
## 
## [[3]]
## # A tibble: 1,836 x 18
##    organisation account_no country gics_sector gics_industry_g… gics_industry
##    <chr>        <chr>      <chr>   <chr>       <chr>            <chr>        
##  1 CASELLA      2840       USA     Industrials Commercial & Pr… Commercial S…
##  2 Accenture    21318      Ireland Informatio… Software & Serv… IT Services  
##  3 Ace Ltd.     21320      Switze… Financials  Insurance        Insurance    
##  4 Advantech C… 21330      Taiwan  Informatio… Technology Hard… Computers & …
##  5 Aga Rangema… 21335      United… Consumer D… Consumer Durabl… Household Du…
##  6 Koninklijke… 21338      Nether… Consumer S… Food & Staples … Food & Stapl…
##  7 Air Partner… 21340      United… Industrials Transportation   Airlines     
##  8 AKENERJİ EL… 21112      Turkey  Utilities   Utilities        Independent …
##  9 Alps Electr… 21365      Japan   Industrials Capital Goods    Electrical E…
## 10 adidas AG    21380      Germany Consumer D… Consumer Durabl… Textiles, Ap…
## # … with 1,826 more rows, and 12 more variables: gics_sub_industry <chr>,
## #   primary_expansion <lgl>, secondary_expansions <chr>, ticker <chr>,
## #   isin <chr>, complexity <chr>, status <chr>, submitted_date <chr>,
## #   public <chr>, response_account_no <chr>, response_organisation <chr>,
## #   version <chr>
## 
## [[4]]
## # A tibble: 1,825 x 18
##    organisation account_no country gics_sector gics_industry_g… gics_industry
##    <chr>        <chr>      <chr>   <chr>       <chr>            <chr>        
##  1 Volkswagen … 20309      Germany Consumer D… Automobiles & C… Automobiles  
##  2 Vontobel Ho… 20280      Switze… Financials  Diversified Fin… Capital Mark…
##  3 Vopak        22287      Nether… Energy      Energy           Oil, Gas & C…
##  4 VP Bank Gru… 20338      Liecht… Financials  Banks            Commercial B…
##  5 VWR Interna… 20344      USA     Consumer D… Retailing        Distributors 
##  6 W.W. Graing… 22861      USA     Industrials Capital Goods    Trading Comp…
##  7 Wacker Chem… 20363      Germany Materials   Materials        Chemicals    
##  8 Wal Mart de… 20332      Mexico  Consumer S… Food & Staples … Food & Stapl…
##  9 Walgreen Co… 20398      USA     Consumer S… Food & Staples … Food & Stapl…
## 10 Wal-Mart St… 20402      USA     Consumer S… Food & Staples … Food & Stapl…
## # … with 1,815 more rows, and 12 more variables: gics_sub_industry <chr>,
## #   primary_expansion <lgl>, secondary_expansions <chr>, ticker <chr>,
## #   isin <chr>, complexity <chr>, status <chr>, submitted_date <chr>,
## #   public <chr>, response_account_no <chr>, response_organisation <chr>,
## #   version <chr>
## 
## [[5]]
## # A tibble: 1,838 x 19
##    organisation account_no country gics_sector gics_industry_g… gics_industry
##    <chr>             <dbl> <chr>   <chr>       <chr>            <chr>        
##  1 3M Company          285 USA     Industrials Capital Goods    Industrial C…
##  2 A & J Muckl…      22750 United… Financials  Real Estate      Real Estate …
##  3 A.G. Barr P…       1535 United… Consumer S… Food, Beverage … Beverages    
##  4 A.P. Moller…        276 Denmark Industrials Transportation   Marine       
##  5 A2A                  87 Italy   Utilities   Utilities        Multi-Utilit…
##  6 Abbott Labo…         64 USA     Health Care Health Care Equ… Health Care …
##  7 AbbVie Inc        45103 USA     Health Care Pharmaceuticals… Pharmaceutic…
##  8 Abengoa           32498 Spain   Industrials Capital Goods    Construction…
##  9 Abercrombie…        291 USA     Consumer D… Retailing        Specialty Re…
## 10 Aberdeen As…         78 United… Financials  Diversified Fin… Capital Mark…
## # … with 1,828 more rows, and 13 more variables: gics_sub_industry <chr>,
## #   primary_expansion <lgl>, secondary_expansions <chr>, ticker <chr>,
## #   isin <chr>, complexity <chr>, status <chr>, submitted_date <chr>,
## #   ammended_date <chr>, public <chr>, response_account_no <chr>,
## #   response_organisation <dbl>, version <chr>
## 
## [[6]]
## # A tibble: 1,959 x 16
##    organisation account_no country industry_group primary_expansi…
##    <chr>        <chr>      <chr>   <chr>          <lgl>           
##  1 Shangri-La … 16789      Hong K… Hotels, Resta… NA              
##  2 Shanks Group 16803      United… Trading Compa… NA              
##  3 Sherwin-Wil… 16842      USA     Chemicals      NA              
##  4 Shimizu Cor… 16837      Japan   Construction … NA              
##  5 Shin-Etsu C… 16839      Japan   Chemicals      NA              
##  6 Shinhan Fin… 16891      South … Banks, Divers… NA              
##  7 Shinjin Tex… 58261      South … Textiles, App… NA              
##  8 Shire        22797      Ireland Pharmaceutica… NA              
##  9 Shiseido Co… 16852      Japan   Consumer Dura… NA              
## 10 Shree Cement 16912      India   Construction … NA              
## # … with 1,949 more rows, and 11 more variables: secondary_expansions <chr>,
## #   ticker <chr>, isin <chr>, complexity <chr>, status <chr>,
## #   submitted_date <chr>, ammended_date <chr>, public <chr>,
## #   response_account_no <chr>, response_organisation <chr>, version <chr>
## 
## [[7]]
## # A tibble: 2,003 x 13
##    program_name project_year account_id account_name region incorporated_co…
##    <chr>               <dbl>      <dbl> <chr>        <chr>  <chr>           
##  1 Climate Cha…         2017         44 3i Group     Unite… United Kingdom  
##  2 Climate Cha…         2017        285 3M Company   USA    USA             
##  3 Climate Cha…         2017      59909 A Schulman … USA    USA             
##  4 Climate Cha…         2017       1535 A.G. Barr P… Unite… United Kingdom  
##  5 Climate Cha…         2017        276 A.P. Moller… Nordi… Denmark         
##  6 Climate Cha…         2017         87 A2A          South… Italy           
##  7 Climate Cha…         2017         94 Aalberts In… Benel… Netherlands     
##  8 Climate Cha…         2017         97 Aareal Bank… DACH   Germany         
##  9 Climate Cha…         2017        282 ABB          DACH   Switzerland     
## 10 Climate Cha…         2017         64 Abbott Labo… USA    USA             
## # … with 1,993 more rows, and 7 more variables: access <chr>,
## #   response_status <chr>, gri <chr>, secondary_expansion <chr>, ticker <chr>,
## #   isin <chr>, submitted_date <chr>
## 
## [[8]]
## # A tibble: 1,804 x 18
##    country organization account_number invitation_stat… region public samples
##    <chr>   <chr>                 <dbl> <chr>            <lgl>  <chr>  <chr>  
##  1 Argent… Banco de Ga…           1408 Submitted        NA     public Self s…
##  2 Austra… AGL Energy              351 Submitted        NA     public Austra…
##  3 Austra… Altium                58682 Submitted        NA     public Austra…
##  4 Austra… Alumina                 609 Submitted        NA     public Austra…
##  5 Austra… Amcor                   674 Submitted        NA     public Austra…
##  6 Austra… AMP                     724 Submitted        NA     public Austra…
##  7 Austra… Ansell                  806 Submitted        NA     public Austra…
##  8 Austra… APA Group               802 Submitted        NA     public Austra…
##  9 Austra… APN Outdoor…          59305 Submitted        NA     public Austra…
## 10 Austra… Aurizon Hol…          36581 Submitted        NA     public Austra…
## # … with 1,794 more rows, and 11 more variables: response_received_date <chr>,
## #   activities <chr>, sectors <chr>, industries <chr>, primary_activity <chr>,
## #   primary_sector <chr>, primary_industry <chr>,
## #   primary_questionnaire_sector <chr>, isi_ns <chr>, tickers <chr>,
## #   authority_types <chr>
map(c_summary, names)
## [[1]]
##  [1] "organisation"         "account_no"           "discloser_id"        
##  [4] "country"              "gics_sector"          "gics_industry_group" 
##  [7] "gics_industry"        "gics_sub_industry"    "access"              
## [10] "primary_expansion"    "secondary_expansions" "complexity"          
## [13] "status"               "documents"           
## 
## [[2]]
##  [1] "organisation"         "account_no"           "discloser_id"        
##  [4] "country"              "gics_sector"          "gics_industry_group" 
##  [7] "gics_industry"        "gics_sub_industry"    "access"              
## [10] "primary_expansion"    "secondary_expansions" "complexity"          
## [13] "status"               "documents"           
## 
## [[3]]
##  [1] "organisation"          "account_no"            "country"              
##  [4] "gics_sector"           "gics_industry_group"   "gics_industry"        
##  [7] "gics_sub_industry"     "primary_expansion"     "secondary_expansions" 
## [10] "ticker"                "isin"                  "complexity"           
## [13] "status"                "submitted_date"        "public"               
## [16] "response_account_no"   "response_organisation" "version"              
## 
## [[4]]
##  [1] "organisation"          "account_no"            "country"              
##  [4] "gics_sector"           "gics_industry_group"   "gics_industry"        
##  [7] "gics_sub_industry"     "primary_expansion"     "secondary_expansions" 
## [10] "ticker"                "isin"                  "complexity"           
## [13] "status"                "submitted_date"        "public"               
## [16] "response_account_no"   "response_organisation" "version"              
## 
## [[5]]
##  [1] "organisation"          "account_no"            "country"              
##  [4] "gics_sector"           "gics_industry_group"   "gics_industry"        
##  [7] "gics_sub_industry"     "primary_expansion"     "secondary_expansions" 
## [10] "ticker"                "isin"                  "complexity"           
## [13] "status"                "submitted_date"        "ammended_date"        
## [16] "public"                "response_account_no"   "response_organisation"
## [19] "version"              
## 
## [[6]]
##  [1] "organisation"          "account_no"            "country"              
##  [4] "industry_group"        "primary_expansion"     "secondary_expansions" 
##  [7] "ticker"                "isin"                  "complexity"           
## [10] "status"                "submitted_date"        "ammended_date"        
## [13] "public"                "response_account_no"   "response_organisation"
## [16] "version"              
## 
## [[7]]
##  [1] "program_name"         "project_year"         "account_id"          
##  [4] "account_name"         "region"               "incorporated_country"
##  [7] "access"               "response_status"      "gri"                 
## [10] "secondary_expansion"  "ticker"               "isin"                
## [13] "submitted_date"      
## 
## [[8]]
##  [1] "country"                      "organization"                
##  [3] "account_number"               "invitation_status"           
##  [5] "region"                       "public"                      
##  [7] "samples"                      "response_received_date"      
##  [9] "activities"                   "sectors"                     
## [11] "industries"                   "primary_activity"            
## [13] "primary_sector"               "primary_industry"            
## [15] "primary_questionnaire_sector" "isi_ns"                      
## [17] "tickers"                      "authority_types"

####Use map to get cleaned names in c_summary

  • [[1]] are cleaned names in sheet Summary of cdp_2011.xlsx spreadsheet which contains data for 2010
  • [[8]] are cleaned names in the sheet Summary Data in the cdp_2018.xlsx spreadsheet which contains data for 2017

Put the cleaned column names into variables

Use output from map to identify column names

  • col_account
  • col_company
  • col_country
  • col_sector- inconsistent sector/industry definitions across years
# names after cleaning
col_account <- c(rep("account_no", 6), "account_id", "account_number")
col_account
## [1] "account_no"     "account_no"     "account_no"     "account_no"    
## [5] "account_no"     "account_no"     "account_id"     "account_number"
col_company <- c(rep("organisation", 6), "account_name", "organization")
col_company
## [1] "organisation" "organisation" "organisation" "organisation" "organisation"
## [6] "organisation" "account_name" "organization"
col_country <- c(rep("country", 6), "incorporated_country", "country")
col_country
## [1] "country"              "country"              "country"             
## [4] "country"              "country"              "country"             
## [7] "incorporated_country" "country"
col_sector <- c(rep("gics_sector", 5), "industry_group", "gri", "primary_industry")
col_sector
## [1] "gics_sector"      "gics_sector"      "gics_sector"      "gics_sector"     
## [5] "gics_sector"      "industry_group"   "gri"              "primary_industry"

Read in data put into variable df_summary

summary <- function(path_in, sheet_in, year_in, col_account, col_company, col_country,
                    col_sector) {
  cleaned <- read_excel(path = path_in, sheet = sheet_in) %>% clean_names()
  cleaned %>%
    mutate(year = year_in) %>%
    select(
      account = col_account, year, company = col_company, country = col_country,
      sector = col_sector
    ) %>%
    mutate(account = as.character(account)) # needed this b/c account was numeric in one or more years
}

l_summary <- list(
  path_in, sheet_in, year_in, col_account, col_company, col_country,
  col_sector
)

df_summary <- pmap_dfr(l_summary, summary)

# add a column to indicate these are public response (will need when merge with all responses)
df_summary_p <- df_summary %>% mutate(response = "public")

df_summary_p
## # A tibble: 14,551 x 6
##    account  year company                  country      sector           response
##    <chr>   <dbl> <chr>                    <chr>        <chr>            <chr>   
##  1 1800     2010 Bharat Petroleum Corpor… India        Energy           public  
##  2 5300     2010 EDP - Energias de Portu… Portugal     Utilities        public  
##  3 29900    2010 Ernst & Young LLP UK     United King… Banks            public  
##  4 10200    2010 Kizoo AG                 Germany      Information Tec… public  
##  5 22300    2010 Kungsleden AB            Sweden       Banks            public  
##  6 13400    2010 Nordea Bank              Sweden       Banks            public  
##  7 15400    2010 QLogic Corp.             USA          Information Tec… public  
##  8 32500    2010 Railway Procurement Age… Ireland      N/A              public  
##  9 23100    2010 Royal Wessanen NV        Netherlands  Consumer Staples public  
## 10 17900    2010 Sumitomo Metal Industri… Japan        Materials        public  
## # … with 14,541 more rows

Write dataframe df_summary to file

df_summary_p has 14551 rows.

write to csv file

df_summary_p %>% write_csv("~/Dropbox/data/cdp/investor/derived-data/df_summary.csv")

Next posts will:

  • clean all responses spreadsheet (part 2)
  • merge with df_summary_p
  • copy to SQL database
Avatar
Elizabeth Stanny
Professor of Business Administration

My research interests include climate change and political disclosures.

Related