Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery Paper Submit

Abstract

Large datasets are now ubiquitous as technology enables higher-throughput experiments, but rarely can a research field truly benefit from the research data generated, owing to inconsistent formatting, undocumented storage or improper dissemination. Here we extract all the meaningful device data from peer-reviewed papers on metal-halide perovskite solar cells published so far and make them available in a database. We collect data from over 42,400 photovoltaic devices with up to 100 parameters per device. We then develop open-source and accessible procedures to analyse the data, providing examples of insights that can be gleaned from the analysis of a large dataset. The database, graphics and analysis tools are made available to the community and will continue to evolve as an open-source initiative. This approach of extensively capturing the progress of an entire field, including sorting, interactive exploration and graphical representation of the data, will be applicable to many fields in materials science, engineering and biosciences.

Main

The halide perovskites have for the last few years been the brightest shining stars in the sky of emerging solar cell materials. They have shown great potential in optoelectronic applications such as tandem solar cells1,2,3,4,5, LEDs6,7, lasers8, photodetectors9,10 and X-ray detectors11, and for single-junction solar cells the record certified power conversion efficiency (PCE) has reached above 25% (ref. 12). The halide perovskite semiconductors thus represent a material class with considerable technological relevance where rapid development is occurring. There are, however, remaining problems related to, for example, stability13,14,15, scalability16,17,18,19 and reliability20; the best material combinations and manufacturing processes are open questions21,22, and key standards and metrics are still under discussion23.

In the normal research cycle, researchers read papers, formulate hypotheses, generate data in the laboratory and publish new papers (Fig. 1). With historic data and insights scattered over an inaccessibly large number of papers, this process is not as efficient as it could be. At the time of writing, the keyword 'perovskite solar' does for instance find over 19,000 papers in the Web of Science, making it essentially impossible to keep up to date with the literature. The perovskite field could thus be said to have a data management problem at an aggregated level.

Fig. 1: Expanding the standard research cycle in experimental material science.

An illustration of the standard research cycle and how the Perovskite Database Project can expand it by providing an open database, interactive visualization tools, protocols and a metadata ontology for reporting device data, open-source code for data analysis and so on. Solid data lines refer to data from published papers treated in this project. Dashed data lines refer to raw data from experimentation and analysed full datasets that are natural extensions to be included later. The dashed 'insight' lines represent the use of the expanded research cycle.

Data have always been the foundation of empirical science, but with modern algorithms and artificial intelligence, entirely new opportunities emerge when data are collected in sufficiently large quantities and in a cohesive manner. Big data has become the lifeblood of the tech giants of Silicon Valley, the fuel for artificial intelligence and a cornerstone for the next industrial revolution24. The field of materials science is in no way oblivious to this development, and several data initiatives have been launched, for example the Materials Project25, Aflow26, NOMAD27, the Crystallography Open Database28, the emerging photovoltaic initiative29 and the inorganic crystal structure database30, to mention a few. Despite these efforts, much of experimental materials science is still struggling to make better use of the data generated31, and notably so in applied fields where materials are often evaluated primarily by their performance in devices.

A concept of increasing importance is the FAIR data principles, that is, data should be findable, accessible, interoperable and reusable32,33. Adhering to those principles can accelerate development and increase the return on investment, as it enables cross-analysis between datasets and data reuse, as well as simplifying the use of artificial intelligence and machine learning. There is also an increased demand from government, funding agencies and journals to disseminate the underlying data accordingly. However, most laboratories are not able to adhere to the FAIR data principles, especially in the applied science fields. There are concurrent reasons behind this, including the lack of suitable data dissemination platforms. Yet, the largest hurdle is the diversity and complexity of the datasets involved. For example, sample properties are frequently influenced by the sample history. Furthermore, they are characterized using a large number of experimental techniques, which vary across disciplines. These small, disconnected and heterogeneous datasets also require a substantial amount of metadata to be of use.

In this project, henceforth referred to as the Perovskite Database Project, we have initiated a communal bottom-up effort to transform perovskite research data management. The Perovskite Database Project aims to expand the normal research cycle by collecting all perovskite solar cell data, both past and future, in one place. Apart from making all historical data accessible and providing means to upload new experimental data, interactive graphical data visualization tools have been implemented that enable simple and interactive exploration, analysis and filtering (Fig. 1). This platform will give both academic researchers and industry an accessible overview of what has been done before, and thereby help in finding relevant knowledge gaps and formulating new scientific questions with the hope of generating new insights, designing better experiments, avoiding known dead ends and accelerating the rate of development. The key goals of the project are to: collect all perovskite solar cell data ever published in one open-access database; develop free interactive web-based tools for simple and interactive exploration, analysis, filtering and visualization of the data; develop procedures and protocols to simplify dissemination and collection of new perovskite data according to the FAIR data principles; release an open-source code base that can be used as a blueprint for similar projects; and give a few demonstrations of insights and analysis that can be easily done if all data are consistently formatted and found in one place.

Details of the database

We have manually gone through every paper found in the Web of Science with the search phrase 'perovskite solar' up to the end of February 2020 (that is, over 15,000 papers). In total, we have manually extracted data for over 42,400 devices. While a few devices with extractable data will have slipped through our net, the devices in the database represent almost every device someone has thought is worth the effort to describe in detail in the peer-reviewed literature.

Our original data extraction protocol contained 95 attributes with metadata, process data and performance data. Those can be grouped into: reference data; cell-related data; data for every functional layer in the device stack, that is, type of substrate, electron transport layer (ETL), perovskite, hole transport layer (HTL), back contact and so on; synthesis-related data for each layer; and key metrics related to the performance of the resulting device, that is, current–voltage, quantum efficiency, stability and outdoor performance (Fig. 2). The categories and the formatting guidelines are described in detail in the supporting documentation. For future use, we have developed a more detailed protocol capturing up to 400 parameters per device, which can be found among the resources on the project's webpage.

Fig. 2: Overview of data categories in the Perovskite Database.

Overview of the main categories of metadata, process data and performance data in the data extraction protocol. IV, current–voltage. QE, quantum efficiency.

Once extracted, the data have been consistently formatted according to the instructions in the supporting documentation and are now freely available in the Perovskite Database. To increase the usability of the data, we have developed interactive tools for simple exploration, analysis, filtering and visualization that can be used without programming knowledge. The code base for the project is written in Python and is available at GitHub (https://github.com/Jesperkemist/perovskitedatabase), and everyone is invited to contribute and expand the scope of the project. All the resources are found at the project website (www.perovskitedatabase.com), where they will be updated and maintained for the foreseeable future.

With all the device data consistently formatted and available in one place, a plethora of interesting possibilities opens. What follows is a small selection of analyses, visualizations and insights made possible by the Perovskite Database and the associated toolbox.

Example uses of the Perovskite Database

As a first example, the perovskite solar cell evolution is illustrated by binning the performance for all available devices and plotting those as a function of publication date (Fig. 3a). This demonstrates the expected trend towards higher-performing devices, as well as offering a sense of the underlying variability by showing the performance distribution, and thereby providing a comprehensive view of the field's progress.
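The binning step described above can be sketched as follows. This is a minimal illustration and not the project's plotting code: it uses rectangular bins rather than the hexagonal bins of Fig. 3a, and the records, bin widths and the name `bin_devices` are assumptions made for the example.

```python
from collections import Counter

# Hypothetical records: (publication date as a decimal year, PCE in %).
# The values are illustrative only and are not taken from the database.
devices = [
    (2014.2, 9.8), (2014.7, 12.1), (2015.1, 13.4), (2015.3, 14.9),
    (2016.5, 15.2), (2016.8, 17.9), (2017.4, 18.3), (2018.9, 20.6),
]


def bin_devices(records, year_width=1.0, pce_width=5.0):
    """Count devices in rectangular (year, PCE) bins.

    Returns a Counter keyed by the lower edge of each bin.
    """
    counts = Counter()
    for year, pce in records:
        year_bin = (year // year_width) * year_width
        pce_bin = (pce // pce_width) * pce_width
        counts[(year_bin, pce_bin)] += 1
    return counts


counts = bin_devices(devices)
for (year_bin, pce_bin), n in sorted(counts.items()):
    print(f"{year_bin:.0f}: PCE {pce_bin:.0f}-{pce_bin + 5:.0f}% -> {n} device(s)")
```

The resulting counts are what a density plot would colour-code; summing them over the date axis gives the efficiency distribution shown at the side of such a plot.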

Fig. 3: Evolution of perovskite cell efficiencies.

Example of analysis from the database. a, Hexbin-plot of PCE measured under standard conditions as a function of the publication date for all devices in the database. Efficiency distribution for all devices is shown to the right. b, Evolution of record efficiency of all cells, flexible cells and CsPbI3-based cells. Data from the NREL efficiency chart are given as a comparison. c, Cell efficiency as a function of the publication date for slot-die-coated perovskites separated by the solvent used for perovskite deposition. DMF, dimethylformamide; DMSO, dimethylsulfoxide; GBL, gamma-butyrolactone; NMP, N-methyl-2-pyrrolidinone. d, Average performance and popularity of MAPbI3, FAxMA1-xPbBryI3-y and CszFAxMA1-x-zPbBryI3-y perovskite compositions as a function of time.

The National Renewable Energy Laboratory (NREL) efficiency chart is probably one of the most reproduced images in the photovoltaic field. It is a highly trustworthy source as it relies exclusively on externally certified results, but it is also limited in scope. The trend in global records illustrated in the NREL chart can easily be reproduced (Fig. 3b), even if some of the data points are different as they are sorted on publication date and include non-certified data. What makes this genuinely interesting is the possibility to filter out the records for any type of cell. With a single mouse click, it is possible to display the performance evolution of, for example, flexible cells, cells based on CsPbI3 or cells fulfilling any combination of constraints (Fig. 3b). With an additional click, the figure can be downloaded and directly incorporated in presentations, applications or a scientific publication. Clicking on a data point will also redirect the user to the original publication, which is a short-cut when searching for papers on a specific topic of interest.
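A record-evolution curve of the kind discussed above is a running maximum over devices sorted by publication date, optionally restricted to a subset. The sketch below shows the idea; the dictionary keys (`date`, `pce`, `flexible`), the sample values and the function name `record_evolution` are hypothetical and do not reflect the database's actual schema.

```python
def record_evolution(records, keep=lambda device: True):
    """Return the (date, pce) points at which a new record PCE is set.

    records: iterable of dicts with the hypothetical keys 'date' and 'pce'.
    keep: predicate selecting the subset of devices to consider.
    """
    best = float("-inf")
    steps = []
    for device in sorted(filter(keep, records), key=lambda d: d["date"]):
        if device["pce"] > best:
            best = device["pce"]
            steps.append((device["date"], best))
    return steps


# Illustrative values only; dates are 'YYYY-MM' strings, which sort correctly.
devices = [
    {"date": "2013-07", "pce": 15.0, "flexible": False},
    {"date": "2014-03", "pce": 10.2, "flexible": True},
    {"date": "2015-01", "pce": 14.1, "flexible": False},
    {"date": "2016-06", "pce": 19.3, "flexible": False},
    {"date": "2017-02", "pce": 15.4, "flexible": True},
]

all_records = record_evolution(devices)
flexible_records = record_evolution(devices, keep=lambda d: d["flexible"])
print(all_records)
print(flexible_records)
```

Passing a different predicate to `keep` corresponds to choosing another combination of constraints, for example a particular perovskite composition.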

A typical use case could be someone starting a project on a particular fabrication method, for example, slot-die coating. In the Perovskite Database, one simple command filters out the data for all available devices with slot-die-coated perovskites. Those data can be obtained in tabular form and downloaded with a click, which gives an entry point to the key literature for further exploration. Once the relevant subset of data is obtained, it can be separated with respect to any of the dimensions represented in the database. To mention a few examples, these can be the perovskite doping conditions, the use of flexible substrates or, as shown in Fig. 3c, the solvent system used during the deposition of the perovskite. This represents a complex literature search that previously required a substantial amount of non-trivial work, but which can now be accomplished and visualized in a few minutes. With this insight at hand, it is just as easy to go on and explore additional questions, such as what is the importance of the annealing temperature, the choice of hole conductor or the antisolvent, or to what extent the perovskite composition influences the key performance metrics of the device. This illustrates a powerful short-cut towards extracting the historical data relevant for a project, for generating new hypotheses, for finding unexplored areas, for knowledge transfer and for acquiring insights otherwise easily overlooked.
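The filter-then-separate workflow described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the database's own interface: the field names (`deposition`, `solvent`, `pce`), the label strings and the values are all invented for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical, hand-made records; not actual database entries.
devices = [
    {"deposition": "slot-die coating", "solvent": "DMF", "pce": 11.0},
    {"deposition": "slot-die coating", "solvent": "DMF; DMSO", "pce": 15.0},
    {"deposition": "spin-coating", "solvent": "DMF; DMSO", "pce": 19.0},
    {"deposition": "slot-die coating", "solvent": "DMF; DMSO", "pce": 17.0},
    {"deposition": "slot-die coating", "solvent": "GBL", "pce": 9.0},
]


def group_by(records, key):
    """Split records into lists that share the same value of `key`."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


# Step 1: keep only devices with a slot-die-coated perovskite.
slot_die = [d for d in devices if d["deposition"] == "slot-die coating"]

# Step 2: separate the subset by solvent system and summarize each group.
summary = {
    solvent: (len(group), mean(d["pce"] for d in group))
    for solvent, group in group_by(slot_die, "solvent").items()
}
print(summary)
```

Replacing `"solvent"` with another field name separates the same subset along a different dimension, such as the annealing temperature or the hole conductor.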

With the aggregated data, it is also possible to visualize trends in how various experimental practices have developed over the past years. An example is given in Fig. 3d, which illustrates how the popularity of a few perovskite compositions, that is, MAPbI3, FAxMA1-xPbBryI3-y and CszFAxMA1-x-zPbBryI3-y, has developed over time. That figure embodies both a technical aspect of device optimization and the more sociological aspect of how experimental practices and ideas spread through a scientific community.

The data collected in the Perovskite Database demonstrate great flexibility in how a functional perovskite solar cell can be constructed. Among the 42,400 devices found in the database at the time of writing, there are over 5,500 unique device stacks (that is, different combinations of contact materials), not considering the more than 400 different families of perovskite compositions (that is, different combinations of the A, B and C-site ions in the perovskite ABC3-structure). More than 1,000 of these stacks have champion PCEs above 18%, and more than 300 have demonstrated PCEs above 20%. The multitude of stacks can be broken down into 1,443 unique ETL stacks, 1,957 HTL stacks, 288 back contact configurations and 194 different substrates. Some options are, however, more common than others. Around 60% of all devices are, for example, based on methylammonium lead iodide (MAPbI3), and the ten most common HTLs are used in 85% of all devices, with Spiro-MeOTAD (C81H68N4O8) used in close to half of them.
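Counting unique device stacks amounts to treating the combination of contact layers as a key and tallying how often each key occurs. The following sketch assumes hypothetical field names and layer labels; the four records are invented and only illustrate the counting.

```python
from collections import Counter

# Hypothetical records; field names and layer labels are assumptions.
devices = [
    {"substrate": "SLG | FTO", "etl": "TiO2-c | TiO2-mp",
     "htl": "Spiro-MeOTAD", "back_contact": "Au"},
    {"substrate": "SLG | FTO", "etl": "TiO2-c | TiO2-mp",
     "htl": "Spiro-MeOTAD", "back_contact": "Au"},
    {"substrate": "SLG | ITO", "etl": "SnO2-np",
     "htl": "Spiro-MeOTAD", "back_contact": "Au"},
    {"substrate": "SLG | ITO", "etl": "PCBM-60 | BCP",
     "htl": "PEDOT:PSS", "back_contact": "Ag"},
]


def stack_key(device):
    """The combination of contact materials that defines a device stack."""
    return (device["substrate"], device["etl"],
            device["htl"], device["back_contact"])


stack_counts = Counter(stack_key(d) for d in devices)
n_unique_stacks = len(stack_counts)

# Share of devices using each HTL, most common first.
htl_counts = Counter(d["htl"] for d in devices)
htl_share = {htl: n / len(devices) for htl, n in htl_counts.most_common()}

print(n_unique_stacks, htl_share)
```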

A problem faced while developing perovskite solar cells, which is in no way unique to the perovskite field, is cell-to-cell and batch-to-batch variation. Such variation can be large, thus masking otherwise statistically significant differences. There are also laboratory-to-laboratory variations, and what appears to make a significant difference in one laboratory may not be relevant in another. This is usually ascribed to undescribed, unexplored, unknown or hidden parameters that might influence, for example, the crystallization dynamics of the perovskite film34. Those could be things such as glove box volume, precise atmospheric composition during fabrication, minor or unintended variations in precursor stoichiometry35,36, chemical impurities37 and so on, to mention a few hypotheses. The Perovskite Database can mitigate that problem by combining all the available disseminated device data. That allows for more holistic conclusions about what works, what does not and how reliable and consistent various procedures are. This is illustrated with a few examples below.

In Fig. 4a, the kernel density estimation, that is, a smoothed estimate of the distribution, of the open-circuit voltage (Voc) is given for three common HTLs. For a fair comparison, only MAPbI3-based devices are included. It turns out that the hole conductor has a notable impact on the Voc that can be expected on average, which is an example of something that is difficult to verify with a limited number of samples produced in a single laboratory but becomes apparent with such extensive data. The figure also indicates that Spiro-MeOTAD may be associated with a small Voc loss, in line with recent discussions concerning interface recombination38, and thus may not be the best choice of hole conductor from a performance point of view. The success of Spiro-MeOTAD may be more an effect of historical coincidence, statistics and it having been heavily optimized than of it having the highest intrinsic potential. Another example is given in Fig. 4b, which compares deposition procedures for TiO2-based ETLs in nip-devices with a MAPbI3 perovskite and Spiro-MeOTAD as HTL, which are the most common ETL and HTL stacks. The very best cells have been made using spin-coated mesoporous TiO2, but on an aggregated level the choice of deposition procedure has a fairly small impact, and all the depicted deposition procedures have resulted in a large spread in device performance. Excluding the mesoporous TiO2 layer does not make much of a difference either for the average cell performance, which is interesting given that the very best cells still use a mesoporous TiO2 layer.
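For readers unfamiliar with kernel density estimation, the following minimal sketch shows what such a curve is: a sum of Gaussian bumps, one centred on each measured value, normalized to unit area. The Voc values and the bandwidth are arbitrary assumptions for illustration, and the hand-written `gaussian_kde` function is not the routine used to produce Fig. 4a.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the probability density of `samples`.

    Each sample contributes a Gaussian of standard deviation `bandwidth`.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Illustrative open-circuit voltages in volts; not database values.
voc_values = [1.02, 1.05, 1.06, 1.08, 1.10, 1.11]
density = gaussian_kde(voc_values, bandwidth=0.02)  # bandwidth is an assumption

# Evaluate on a grid from 0.90 V to 1.20 V and check the area numerically.
step = 0.001
grid = [0.90 + step * i for i in range(301)]
area = sum(density(x) for x in grid) * step
print(f"area under the estimated density: {area:.3f}")
```

The choice of bandwidth controls how strongly the distribution is smoothed, so comparisons between groups are most meaningful when the same bandwidth rule is applied to each.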

Fig. 4: Example of analysis from the database.

a, The kernel density estimation (KDE) of the Voc for three common HTLs for MAPbI3-based devices. b, Performance distributions separated by deposition procedures for the TiO2-ETL in nip devices with MAPbI3 and Spiro-MeOTAD. The top panel includes all cells with a compact TiO2 layer but without a mesoporous TiO2. The remaining panels include cells with both compact (-c) and mesoporous (-mp) TiO2 layers and are separated by the deposition procedure for each layer. The solid lines are the kernel density estimates. c, The experimental and fitted bandgap for the FAxMA1-xPbBryI3-y system. The background colour represents the fitted surface, the white lines are isolines and points represent the experimental data. The colour bar represents the bandgap in eV. The colour scheme gives special emphasis to outliers.

The previous examples illustrate the power of having access to large, diverse, consistently formatted and interoperable datasets. They also only scratch the surface while raising new questions that invite further exploration by digging deeper into the data. We anticipate this dataset will be an excellent resource for future work in perovskite groups as well as in the broader machine learning and data science communities.

One of the technologically appealing aspects of the metal-halide perovskites is the tunability of the bandgap (Eg), which ranges from below 1.2 eV for MAPb0.5Sn0.5I3 (ref. 39) to above 3 eV for MAPbCl3 (ref. 40). One way to use the collected bandgap data is to filter out perovskite compositions in a desired bandgap range. Another is to extrapolate the bandgap of previously unexplored compositions, as illustrated in Fig. 4c. Here a second-degree polynomial has been fitted to the bandgap values in the database relating to composition in the FAxMA1-xPbBryI3-y system. Conversely, in such a compositional space, a simple optical measurement could then be used to estimate the perovskite composition. With the analysis code freely available, a fitting procedure such as that in Fig. 4c could easily be done for any compositional range where sufficient data are available, and it can be updated whenever new data are made available.
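A second-degree polynomial in two composition variables has six terms, and fitting it is an ordinary least-squares problem. The sketch below solves the normal equations with a small hand-written Gaussian elimination so that it needs no external libraries. It is not the project's fitting code, and the coefficients in `TRUE_COEFFS` are arbitrary numbers used only to generate synthetic test data, not measured bandgap parameters.

```python
def features(x, y):
    """Second-degree polynomial terms in the composition variables x and y."""
    return [1.0, x, y, x * x, x * y, y * y]


def solve(a, b):
    """Solve the linear system a.c = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    coeffs = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][k] * coeffs[k] for k in range(r + 1, n))
        coeffs[r] = (m[r][n] - tail) / m[r][r]
    return coeffs


def fit_bandgap(points):
    """Least-squares fit via the normal equations; points are (x, y, Eg)."""
    rows = [features(x, y) for x, y, _ in points]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * p[2] for r, p in zip(rows, points)) for i in range(n)]
    return solve(ata, atb)


def predict(coeffs, x, y):
    """Evaluate the fitted polynomial at composition (x, y)."""
    return sum(c * f for c, f in zip(coeffs, features(x, y)))


# Synthetic data: ARBITRARY coefficients, chosen only to test the fit.
TRUE_COEFFS = [1.60, -0.08, 0.20, 0.02, 0.01, 0.03]
x_grid = [i / 4 for i in range(5)]       # x from 0 to 1
y_grid = [3 * i / 4 for i in range(5)]   # y from 0 to 3
points = [(x, y, predict(TRUE_COEFFS, x, y)) for x in x_grid for y in y_grid]

coeffs = fit_bandgap(points)
print([round(c, 6) for c in coeffs])
```

Because the synthetic data are noise-free, the fit recovers the generating coefficients; with real, noisy bandgap values the same procedure returns the best-fitting surface in the least-squares sense.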

Most devices have been made with perovskites with a bandgap of around 1.55–1.65 eV (Fig. 5a). That is where MAPbI3 is found and it is the most interesting region for perovskite single-junction cells. For tandem integration, the need for optical matching between the subcells means that higher bandgaps are required for the top cell41. Unfortunately, from a tandem perspective, there is a drop in performance when the bandgap increases above roughly 1.8 eV, with the trend continuing up to 2.3 eV (Fig. 5a). This is primarily caused by an increased Voc loss, which probably originates from a light-induced partial phase separation in mixed Br/I-perovskites42, sometimes referred to as the Hoke effect43.

Fig. 5: Identification of key challenges in the development of perovskite solar cells.

Remaining key challenges. a, PCE versus Eg for all solar cells in the database. The Shockley–Queisser limit is given as a solid line. b, Illustration of perovskite scalability. c, T80 versus publication date for devices measured under AM 1.5 and MPPT. The solid line represents a linear fit to data.

When comparing the performance as a function of the perovskite bandgap in more detail, some results are found to be unphysical as they surpass the Shockley–Queisser (SQ) limit, most frequently in terms of a too large short-circuit current. Some of those points can be explained by mislabelled or misreported bandgaps, whereas others may be caused by errors in light source calibration and aperture area. Nevertheless, this illustrates a neglect of basic error checking in historic reports.

Another major challenge towards commercial viability is scalability. Most laboratory cells have an active area ≤0.2 cm2, and it is also for these small cells that the highest efficiencies are found. When the cell area increases, there is a downward trend in maximum performance (Fig. 5b), with a spike at 1 cm2, which is a common cell area used in the first step towards upscaling. The average performance is rather constant with respect to the device area. The reasons for this are unclear, but a possible explanation could be the limited number of cells larger than 5 cm2 reported so far and that upscaling is primarily pursued by groups already producing high-quality small-scale devices.

Long-term stability under operational conditions is a key requirement for any photovoltaic technology, and anyone making perovskite devices, particularly with early methods and recipes, quickly realizes that this will be a challenge. Stability data of any kind are, however, available for less than 20% of the cells in the database. At the time of writing, the Perovskite Database contains 7,400 entries with stability data, and 5,500 of those are variations of shelf life in the dark, where devices are stored and remeasured over time. There are around 550 entries with measurements under operational conditions, that is, air mass (AM) 1.5G and maximum power point tracking (MPPT). Historical comparison of stability is complicated both by the scarcity of high-quality data and by a lack of common standards and protocols for measuring and reporting stability data. This is, however, changing due to an active discussion in the field, which recently resulted in a list of International Summit on Organic Photovoltaic Stability (ISOS) consensus protocols related to measuring and reporting of stability data23. The Perovskite Database Project is fully compatible with those ISOS protocols.

There is not one single key metric of device stability but several, all with their own merits and limitations. One of the more commonly used is the T80 value, which is the time it takes for a cell to lose 20% of its initial performance. In Fig. 5c, the T80 versus publication date is given for the almost 120 devices in the database measured under AM 1.5 and MPPT, and where a T80 is stated (that is, less than 0.3% of all cells). There is a general trend towards more devices with higher stabilities as the years progress, even if we still have rather few data points. Given the importance of the problem, we expect a dramatic increase in reporting this type of data in the next few years.
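Following the definition above, a T80 value can be read off a measured degradation curve by finding where the performance first falls to 80% of its initial value. The sketch below uses linear interpolation between measurements and takes the first data point as the initial performance; both are simplifying assumptions, the function name `t80` is hypothetical, and the sample trace is invented.

```python
def t80(times, pce):
    """Time at which performance first drops to 80% of its initial value.

    Interpolates linearly between neighbouring measurements and returns
    None if the threshold is never reached within the measured period.
    Assumes pce[0] is the initial performance and is positive.
    """
    threshold = 0.8 * pce[0]
    for i in range(1, len(pce)):
        if pce[i] <= threshold:
            t0, t1 = times[i - 1], times[i]
            p0, p1 = pce[i - 1], pce[i]
            return t0 + (p0 - threshold) * (t1 - t0) / (p0 - p1)
    return None


# Illustrative degradation trace: time in hours, PCE in %.
hours = [0, 100, 200, 300, 400]
efficiency = [20.0, 19.0, 17.0, 15.0, 14.0]

print(t80(hours, efficiency))              # threshold is 16.0 %
print(t80([0, 100, 200], [20.0, 19.5, 19.0]))  # never reaches the threshold
```

Stability protocols may define the reference point differently, for example after an initial burn-in period, in which case the curve would be truncated before calling the function.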

Figure 5 represents a first glimpse of what is found in the Perovskite Database related to the three core technological challenges, namely tandem integration, scalability and stability. All these aspects deserve a much longer analysis, and we expect a multitude of papers to be written based on these open-source resources, both by us and by others. We intend the Perovskite Database to be a living, evolving and scalable project, and we expect future work to expand the scope of the project by adding new data, functionality, analysis, visualizations and open-source code.

Future expansion of the database

The ambition of the Perovskite Database Project is to collect not only historic data but all future device data as well, to create a new standard for disseminating perovskite device data and to build what we can think of as the Wikipedia of perovskite solar cell research. This will require participation from the entire perovskite community, with a mental shift towards a culture where everyone feels that they can, want to and will disseminate their device data by uploading it to the Perovskite Database as a complement to traditional publishing.

Uploading new data will take some time and effort. The Perovskite Database Project must therefore deliver a high degree of perceived use, simplicity, visibility, longevity and trustworthiness. In terms of use, we hope the examples in this paper, together with the interactive graphics on the project's website, have demonstrated the power of aggregated datasets adhering to the FAIR data principles, and that this alone provides an incentive to contribute. There are also other benefits to uploading one's own data. Sharing data in this way gives it new life and draws additional attention to the original publication; it is a way to comply with the demands for openness more frequently seen from taxpayers, funding agencies and publishers; and it is a service to the community that helps to advance the development of new solar cell technology. Finally, the tools and protocols we provide may help in organizing and improving local data management and thereby, in the end, simplify planning, analysis and writing.

In terms of simplicity, we have developed intuitive and well-documented data extraction protocols. The backend for data cleaning and validation is written in Python, and the backend for collecting and reporting data is currently in the form of an Excel template. The Excel template is self-explanatory, easy to use, freely available and possible to extend to fit different laboratories' internal needs. Because the template is transparent and freely available, it is possible to build customized data pipelines that directly feed data from laboratory equipment into it, thereby simplifying data entry even further.

Our vision is that uploading data into databases such as this one will become standard procedure, as this will strengthen the associated publication by increasing its visibility and usefulness. We further anticipate involving publishers as important stakeholders in this project. Making experimental data accessible on platforms used by most of the research community will increase the visibility of scientific results. In addition, the aggregation of all device data allows a straightforward assessment as to whether reported device performance metrics are physically possible (for example, that they are within the Shockley–Queisser limit for single-junction solar cells) or deviate substantially from common trends.

To ensure the project's longevity, we have secured support from the Helmholtz Association in Germany, which acts as a guarantor ensuring that the web resources, that is, the database, webpage and the GitHub account, will be operational and maintained for the coming decade, with an option of possible prolongation.

Another key aspect related to trustworthiness is the open-source nature of the project, which provides transparency, allows users to suggest improvements and provide additional functionality, and enables an easy restart in case of disruption.

The database could also easily be expanded to include data relevant to, for example, LEDs, lasers, scintillators and so on, and we actively encourage initiatives in that direction.

A key problem addressed in this project is the challenge of keeping track of the field's progress when data are inconsistently formatted and scattered over an inaccessibly large number of papers. A related problem is data loss, or the iceberg problem44,45. In a typical project, there may be hundreds and sometimes thousands of devices made before the paper is written. Despite this, the average number of devices for which we could extract data was fewer than six per publication with original device data. A common pattern is that one parameter is changed in a few steps, and for each of those steps data for the best device could be found. Some of the data for the missing devices are presented as statistical averages, even if the data for the individual devices cannot be extracted from the papers. Data for other devices are, for various reasons, never disseminated and are essentially lost forever. Data for most of the best devices are probably disseminated, but there is a wealth of information hidden in the data now lost44,45. With the tools developed here, we facilitate reporting data for those kinds of device as well in future reports, which could mitigate the bias against disseminating data for failed experiments and less successful devices.

Conclusions

In this Perovskite Database Project, we have created an open-access database for perovskite solar cell device data and visualization tools for interactive data exploration, and we have populated the database with data for over 42,000 devices described in the peer-reviewed literature up until spring 2020. We also demonstrate the capabilities of the database and the associated tools by giving a few examples of insights that can be gleaned from the analysis of this large dataset in terms of, for example, record development, tandem integration, stability and scalability. We hope that this project will prompt better data management in the perovskite field as well as a culture of data sharing, and that it will inspire other experimental fields to do the same. We could then get data with a more fine-grained data mesh and make those data available for most devices ever made, not just a few highlighted in papers as has been the case historically. In a few years, we could then have data for millions of devices, which will enable us to finally take greater advantage of machine learning and other artificial intelligence-based methods to accelerate development even further.

Methods

The search phrase 'perovskite solar' in the Web of Science generated over 15,000 entries by the end of February 2020. Not all of those publications relate to metal-halide perovskites and photovoltaic applications, but most do. Similarly, a few relevant papers will be missed in this search. From here, our collective team has manually gone through every paper and extracted data for all the described devices.

Of the publications we went through, we found original experimental device data in close to half of them, that is, around 7,400. Among the remaining papers, we found reviews, theoretical investigations and studies focused on material properties, as well as some non-photovoltaic-/perovskite-related publications. In total, we have manually extracted data for over 42,400 devices. The total time consumption to do this is in the range of 5,000–10,000 human hours.
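The counts quoted in this section imply a few simple ratios. The sketch below works them out from the rounded figures in the text (stated there as 'over' or 'around'), so the results are approximate; the variable and function names are illustrative only.

```python
# Rounded figures quoted in the Methods text.
papers_screened = 15_000     # Web of Science hits for 'perovskite solar'
papers_with_data = 7_400     # papers containing original device data
devices_extracted = 42_400   # devices entered in the database
hours_low, hours_high = 5_000, 10_000  # estimated extraction effort


def per_unit(total: float, units: float) -> float:
    """Return total / units, rejecting a non-positive denominator."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total / units


# Average number of devices recovered per paper that had device data.
devices_per_paper = per_unit(devices_extracted, papers_with_data)

# Fraction of screened papers that contained original device data.
share_with_data = per_unit(papers_with_data, papers_screened)

# Rough reading and extraction time per screened paper, in minutes.
minutes_per_paper = (
    60 * per_unit(hours_low, papers_screened),
    60 * per_unit(hours_high, papers_screened),
)

print(f"devices per paper with data: {devices_per_paper:.1f}")
print(f"share of papers with data:   {share_with_data:.0%}")
print(f"minutes per screened paper:  "
      f"{minutes_per_paper[0]:.0f} to {minutes_per_paper[1]:.0f}")
```

With these inputs the average comes out at about 5.7 devices per paper with data, in line with the 'fewer than six' stated earlier, and the effort corresponds to roughly 20 to 40 minutes per screened paper.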

On the basis of our collective experience of perovskite device development and optimization, the total number of devices ever made is probably at least two orders of magnitude larger, but data for most of those devices cannot be extracted from the publications. In fact, data for most devices are available only as average values or in scatterplots, or are not disseminated at all.

One database entry per device has been the default procedure, but if only averaged data were found, we entered them as belonging to one cell but specified the number of devices the average is based on. Another guiding principle has been that, while it is preferable to have all possible data for a device, having some data is better than having none. We have thus not discarded data based on poor or limited device descriptions in the scientific publications. We also considered a best estimate of, for example, a perovskite composition to be worth more than stating the information as unknown. This could, for example, be the case for solvent-based ion exchange procedures, where the ionic fractions in the perovskite cannot be derived from the composition of the precursor solutions but can be inferred from optical or X-ray diffraction data.
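Because an entry may stand for either a single device or an average over several, any statistic computed across entries has to decide how to weight them. The following is a minimal sketch of one option, a device-count-weighted mean. The `Entry` class, its field names and the DOI strings are hypothetical placeholders, not the database's actual schema.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Entry:
    """One database row (hypothetical fields, not the real schema)."""
    doi: str            # placeholder DOI of the source paper
    pce: float          # power conversion efficiency, in percent
    n_devices: int = 1  # >1 when the paper reported only an average


def weighted_mean_pce(entries: Iterable[Entry]) -> float:
    """Mean PCE in which each entry counts once per underlying device."""
    entries = list(entries)
    total_devices = sum(e.n_devices for e in entries)
    if total_devices == 0:
        raise ValueError("no devices to average over")
    return sum(e.pce * e.n_devices for e in entries) / total_devices


def unweighted_mean_pce(entries: Iterable[Entry]) -> float:
    """Mean PCE in which every entry counts once, whatever it represents."""
    entries = list(entries)
    if not entries:
        raise ValueError("no entries to average over")
    return sum(e.pce for e in entries) / len(entries)


# Illustrative data: one champion device plus an average over nine devices.
entries = [
    Entry("10.0000/example-a", pce=20.0),
    Entry("10.0000/example-a", pce=18.0, n_devices=9),
]

print(weighted_mean_pce(entries))
print(unweighted_mean_pce(entries))
```

In this made-up example the unweighted mean is pulled towards the single best device, whereas the weighted mean reflects the nine averaged devices as well. Which of the two is appropriate depends on the question being asked.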

All data contain errors. That is unavoidable. Sources of error include: erroneous data in the original papers, which can arise for several reasons; misinterpretation of data, which is easily done when papers are ambiguous or confusingly written; and errors made while transferring data from papers to the database. We have therefore set up a system for reporting dubious data points, and we thereby expect some self-correction over time, especially for data points of special interest such as records in subfields. To reduce the errors, we went through the extracted data to check for errors, misunderstandings, confusing entries and inconsistent formatting. For future data, where we expect authors to upload their own data, we expect a lower error rate than for the historical dataset. It is, nonetheless, advisable to double check outliers, especially when the applied search filters generate small datasets, so as not to draw erroneous conclusions. We also encourage authors, who know their own data best, to double check their devices in the database.
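One way to act on the advice about outliers is to flag values that sit far from the median of a filtered subset before drawing conclusions from it. The sketch below uses a generic median-absolute-deviation rule (the modified z-score with the commonly used cutoff of 3.5). This is an assumption chosen for illustration, not a procedure described in the paper, and the sample values are invented.

```python
from statistics import median
from typing import List, Sequence


def flag_outliers(values: Sequence[float], threshold: float = 3.5) -> List[int]:
    """Return indices of values whose modified z-score exceeds threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. An empty list is returned when there are
    fewer than three values or when MAD is zero, because the rule cannot
    say anything useful in those cases; such subsets need a manual check.
    """
    if len(values) < 3:
        return []
    centre = median(values)
    deviations = [abs(v - centre) for v in values]
    mad = median(deviations)
    if mad == 0:
        return []
    return [
        i for i, dev in enumerate(deviations)
        if 0.6745 * dev / mad > threshold
    ]


# Invented efficiencies (percent) for a small filtered subset.
pce_values = [18.1, 18.4, 17.9, 18.2, 28.0]
suspects = flag_outliers(pce_values)
print(suspects)  # indices to look up in the original publications
```

A flagged value is only a prompt to go back to the source paper. It may be a transcription error, or it may be a genuine record.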

Every data point in the database is linked to the DOI number of the original publication. Every data point is thus effectively cited in the database, and for everyone who uses the data found there it is straightforward to use this DOI linkage to both find and cite the original sources of the data used.
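As an illustration of how the DOI linkage can be used, the sketch below collects the distinct source publications behind a filtered subset of records so that they can be cited. The record layout (dictionaries with `doi` and `pce` keys) and the DOI strings are hypothetical placeholders; only the `https://doi.org/` resolver prefix is real.

```python
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def cited_sources(
    records: Iterable[Record],
    keep: Callable[[Record], bool],
) -> List[str]:
    """Return resolver URLs for the distinct DOIs of the records kept.

    Order of first appearance is preserved so the output is reproducible.
    """
    seen: Dict[str, None] = {}
    for record in records:
        if keep(record):
            seen.setdefault(str(record["doi"]), None)
    return [f"https://doi.org/{doi}" for doi in seen]


# Invented records; several devices can share one source publication.
records = [
    {"doi": "10.0000/example-1", "pce": 21.0},
    {"doi": "10.0000/example-2", "pce": 15.0},
    {"doi": "10.0000/example-1", "pce": 22.5},
    {"doi": "10.0000/example-3", "pce": 20.5},
]

# Sources to cite when using only devices above 20% efficiency.
sources = cited_sources(records, lambda r: r["pce"] > 20)
for url in sources:
    print(url)
```

Deduplicating by DOI matters because the number of devices per paper varies, and a publication should appear once in the resulting citation list regardless of how many of its devices pass the filter.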

Data availability

The project has a dedicated website, www.perovskitedatabase.com, which provides access to all resources. Among those are the Perovskite Database, interactive graphics for exploring the database, instructions for what is found in the database, templates and instructions for uploading new data, links to all works related to the project and so on.

Code availability

Code reproducing all analyses in this paper is available in the GitHub repository at https://github.com/Jesperkemist/perovskitedatabase.

References

  1. Al-Ashouri, A. et al. Monolithic perovskite/silicon tandem solar cell with >29% efficiency by enhanced hole extraction. Science 370, 1300–1309 (2020).

  2. Snaith, H. J. Perovskites: the emergence of a new era for low-cost, high-efficiency solar cells. J. Phys. Chem. Lett. 4, 3623–3630 (2013).

  3. Bailie, C. D. et al. Semi-transparent perovskite solar cells for tandems with silicon and CIGS. Energy Environ. Sci. 8, 956–963 (2015).

  4. Albrecht, S. et al. Monolithic perovskite/silicon-heterojunction tandem solar cells processed at low temperature. Energy Environ. Sci. 9, 81–88 (2016).

  5. Jošt, M., Kegelmann, L., Korte, L. & Albrecht, S. Monolithic perovskite tandem solar cells: a review of the present status and advanced characterization methods toward 30% efficiency. Adv. Energy Mater. 10, 1904102 (2020).

  6. Tan, Z.-K. et al. Bright light-emitting diodes based on organometal halide perovskite. Nat. Nanotechnol. 9, 687–692 (2014).

  7. Van Le, Q., Jang, H. W. & Kim, S. Y. Recent advances toward high-efficiency halide perovskite light-emitting diodes: review and perspective. Small Methods 2, 1700419 (2018).

  8. Deschler, F. et al. High photoluminescence efficiency and optically pumped lasing in solution-processed mixed halide perovskite semiconductors. J. Phys. Chem. Lett. 5, 1421–1426 (2014).

  9. Domanski, K. et al. Working principles of perovskite photodetectors: analyzing the interplay between photoconductivity and voltage-driven energy-level alignment. Adv. Func. Mater. 25, 6936–6947 (2015).

  10. Ahmadi, M., Wu, T. & Hu, B. A review on organic–inorganic halide perovskite photodetectors: device engineering and fundamental physics. Adv. Mater. 29, 1605242 (2017).

  11. Kraus, H., Mykhaylyk, V. & Saliba, M. Bright and fast scintillation of organolead perovskite MAPbBr3 at low temperatures. Mater. Horiz. 6, 1740–1747 (2019).

  12. Green, M. A. et al. Solar cell efficiency tables (version 56). Prog. Photovolt. Res. Appl. 28, 629–638 (2020).

  13. Wali, Q. et al. Advances in stability of perovskite solar cells. Org. Electron. 78, 105590 (2020).

  14. Krishnan, U., Kaur, M., Kumar, M. & Kumar, A. Factors affecting the stability of perovskite solar cells: a comprehensive review. J. Photon. Energy 9, 021001 (2019).

  15. Howard, J. M., Tennyson, E. M., Neves, B. R. & Leite, M. S. Machine learning for perovskites' reap-rest-recovery cycle. Joule 3, 325–337 (2019).

  16. Park, N.-G. & Zhu, K. Scalable fabrication and coating methods for perovskite solar cells and solar modules. Nat. Rev. Mater. 5, 333–350 (2020).

  17. Qiu, L., He, S., Ono, L. K., Liu, S. & Qi, Y. Scalable fabrication of metal halide perovskite solar cells and modules. ACS Energy Lett. 4, 2147–2167 (2019).

  18. Swartwout, R., Hoerantner, M. T. & Bulović, V. Scalable deposition methods for large-area production of perovskite thin films. Energy Environ. Mater. 2, 119–145 (2019).

  19. Matteocci, F., Castriotta, L. A. & Palma, A. L. in Photoenergy and Thin Film Materials (ed. Yang, X.-Y.) 121–155 (Wiley, 2019).

  20. Li, N., Niu, X., Chen, Q. & Zhou, H. Towards commercialization: the operational stability of perovskite solar cells. Chem. Soc. Rev. 49, 8235–8286 (2020).

  21. Howard, I. A. et al. Coated and printed perovskites for photovoltaic applications. Adv. Mater. 31, 1806702 (2019).

  22. Mathies, F., List-Kratochvil, E. J. & Unger, E. L. Advances in inkjet-printed metal halide perovskite photovoltaic and optoelectronic devices. Energy Technol. 8, 1900991 (2020).

  23. Khenkin, M. V. et al. Consensus statement for stability assessment and reporting for perovskite photovoltaics based on ISOS procedures. Nat. Energy 5, 35–49 (2020).

  24. Schwab, K. & Davis, N. Shaping the Future of the Fourth Industrial Revolution (Currency, 2018).

  25. Jain, A. et al. Commentary: the materials project: a materials genome approach to accelerating materials innovation. APL Mater. 1, 011002 (2013).

  26. Curtarolo, S. et al. AFLOW: an automatic framework for high-throughput materials discovery. Comp. Mater. Sci. 58, 218–226 (2012).

  27. Draxl, C. & Scheffler, M. The NOMAD laboratory: from data sharing to artificial intelligence. J. Phys. Mater. 2, 036001 (2019).

  28. Gražulis, S. et al. Crystallography Open Database–an open-access collection of crystal structures. J. Appl. Crystallogr. 42, 726–729 (2009).

  29. Almora, O. et al. Device performance of emerging photovoltaic materials (version 1). Adv. Energy Mater. 11, 2002774 (2020).

  30. Bergerhoff, G., Brown, I. D. & Allen, F. Crystallographic Databases (International Union of Crystallography, 1987).

  31. Empty rhetoric over data sharing slows science. Nature 546, 327 (2017).

  32. Wilkinson, M. D. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3, 160018 (2016).

  33. Draxl, C. & Scheffler, M. NOMAD: the FAIR concept for big data-driven materials science. MRS Bull. 43, 676–682 (2018).

  34. Zeng, L. et al. Controlling crystallization dynamics of photovoltaic perovskite layers on larger-area coatings. Energy Environ. Sci. 13, 4666–4690 (2020).

  35. Jacobsson, T. J. et al. Unreacted PbI2 as a double-edged sword for enhancing the performance of perovskite solar cells. J. Am. Chem. Soc. 138, 10331–10343 (2016).

  36. Fassl, P. et al. Fractional deviations in precursor stoichiometry dictate the properties, performance and stability of perovskite photovoltaic devices. Energy Environ. Sci. 11, 3380–3391 (2018).

  37. Zhang, Y. et al. Achieving reproducible and high-efficiency (>21%) perovskite solar cells with a presynthesized FAPbI3 powder. ACS Energy Lett. 5, 360–366 (2019).

  38. Gharibzadeh, S. et al. Record open-circuit voltage wide-bandgap perovskite solar cells utilizing 2D/3D perovskite heterostructure. Adv. Energy Mater. 9, 1803699 (2019).

  39. Ogomi, Y. et al. CH3NH3SnxPb(1-x)I3 perovskite solar cells covering up to 1,060 nm. J. Phys. Chem. Lett. 5, 1004–1011 (2014).

  40. Liu, D., Yang, C. & Lunt, R. R. Halide perovskites for selective ultraviolet-harvesting transparent photovoltaics. Joule 2, 1827–1837 (2018).

  41. Jacobsson, T. J. et al. 2-Terminal CIGS-perovskite tandem cells: a layer by layer exploration. Sol. Energy 207, 270–288 (2020).

  42. Jacobsson, T. J. et al. Exploration of the compositional space for mixed lead halogen perovskites for high efficiency solar cells. Energy Environ. Sci. 9, 1706–1724 (2016).

  43. Hoke, E. T. et al. Reversible photo-induced trap formation in mixed-halide hybrid perovskites for photovoltaics. Chem. Sci. 6, 613–617 (2015).

  44. Heidorn, P. B. Shedding light on the dark data in the long tail of science. Libr. Trends 57, 280–299 (2008).

  45. Raccuglia, P. et al. Machine-learning-assisted materials discovery using failed experiments. Nature 533, 73–76 (2016).

Acknowledgements

The core funding of the project has been received from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 787289. We acknowledge MaterialsZone (https://www.materials.zone/) for technical assistance and for hosting the project's cloud resources. We acknowledge Helmholtz-Zentrum Berlin für Materialien und Energie for guaranteeing economic and technical support for keeping the project online for the next decade. We acknowledge the following sources for individual funding: Cambridge India Ramanujan Scholarship, China Scholarship Council, Deutscher Akademischer Austauschdienst (DAAD), EPSRC (grant no. EP/S009213/1), European Union's Horizon 2020 research and innovation programme (grant no. 764787, EU Project 'MAESTRO'), (grant no. 756962, ERC Project 'HYPERION'), (grant no. 764047, EU Project 'ESPResSo' and grant no. 850937), GCRF/EPSRC SUNRISE (EP/P032591/1), German Federal Ministry for Education and Research (BMBF), HyPerFORME, NanoMatFutur (grant no. 03XP0091), PEROSEED (ZT-0024), Helmholtz Energy Materials Foundry, The Helmholtz Innovation Laboratory HySPRINT, BMBF (grant nos. 03SF0540, 03SF0557A), HyPerCells graduate school, Helmholtz Association, Helmholtz International Research School (HI-SCORE), the Erasmus programme (CDT-PV, grant no. EP/L01551X/1), the European Union's Horizon 2020 research and innovation programme (Marie Skłodowska-Curie grant agreement nos. 841386, 795079 and 840751), Royal Society University Research Fellowship (grant no. UF150033), SNaPSHoTs (BMBF), SPARC II, German Research Foundation (DFG, grant no. SPP2196), The National Natural Science Foundation of China (grant no. 51872014), the Recruitment Plan of Global Experts, Fundamental Research Funds for the Central Universities and the '111' project (grant no. B17002), the US Department of Energy's Office of Energy Efficiency and Renewable Energy under Solar Energy Technologies Office (SETO) agreement no. DE-EE0008551, the Colombia Scientific Programme in the framework of the call Ecosistema Cientifíco (Contract no. FP44842-218-2018), the commission for the development of research (CODI) of the Universidad de Antioquia (grant no. 2017-16000), Spanish MINECO (Severo Ochoa programme, grant no. SEV‐2015‐0522), the Swedish research council (VR, grant no. 2019-05591) and the Swedish Energy Agency (grant no. 2020-005194).

Funding

Open access funding provided by Helmholtz-Zentrum Berlin für Materialien und Energie GmbH.

Author information

Affiliations

Contributions

T.J.J. and E.U. designed the project. T.J.J. coordinated the project, wrote much of the code for the interactive graphics and wrote the first draft of the paper. Thou.V., A.Y.A. and O.Y. worked on coding. T.J.J., A.H., A.G.-F., A. Anand, A.A-A., A.H., A.C., A. Abate, A.G.R., A.V., A.Thou., B.P.D., B.Y., B.L.C., C.A.R.P., C.R., D.R., D.F.-J., D.D.G., D.J., E.A., E.J.J.-P., F.B., F.Grand., K.S.A.1000., Chiliad.B., 1000.N., G.P., Grand.G.-D., H.N., H.Chiliad., H.Yard., H.W., I.B., M.I.D., I.B.P., I.E.G., J.N.V., J.D., J.K., J.Y., J.L., J.A.S., J.P., J.J.J.-R., J.F.M., J.-P.C-B., J.Q., J.W., K.S., K.H., G.D., Chiliad.F., L.1000., L.A.C., M.H.A., M.V.-Thousand., M.A.R.-P., M.A.F., Thou.V.Thousand., M.G., M.K., M.S., M.A., N.A., O.S., O.M., O.S.M., P.F., Q.Z., R.B., R.M., R.P., S.S., S.A., S.K., T.U., T.A., T.E., T.W.D., U.W.P., W.Z., W.F., W.Z., V.R.F.S., W.T., X.Z., Y-.H.C., Z.I., Z.X. and E.U. all contributed to the laborious task of going through the literature, extracting the data found there and formatting them consistently. All authors have participated in preparing the final draft of the paper.

Corresponding authors

Correspondence to T. Jesper Jacobsson or Eva Unger.

Ethics declarations

Competing interests

MaterialsZone is a web platform used for managing, standardizing, sharing and analysing data in the field of Materials Science, and is aimed at researchers in both academia and industry. In this project, MaterialsZone worked in collaboration with Helmholtz-Zentrum Berlin to make the Perovskite Database Project easily accessible to anyone interested in these data in an open and convenient way. The people who participated in this project from MaterialsZone are: A.Y.A (Chief Executive Officer of the company, and PhD in Materials Science), O.Y. (Chief Technology Officer of the company and PhD in Mathematics) and Yard.V. (Senior Programmer and Data Scientist, and PhD in Experimental Physics). The remaining authors declare no competing interests.

Additional information

Peer review information Nature Energy thanks Chris Deline, Sang Il Seok, Marina Leite and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Jacobsson, T.J., Hultqvist, A., García-Fernández, A. et al. An open-access database and analysis tool for perovskite solar cells based on the FAIR data principles. Nat Energy 7, 107–115 (2022). https://doi.org/10.1038/s41560-021-00941-3

  • DOI: https://doi.org/10.1038/s41560-021-00941-3

Source: https://www.nature.com/articles/s41560-021-00941-3
