Complete dataset of pore water chemical parameters measured at the Marsh Resource Meadowlands Mitigation Bank, a tidal marsh within the New Jersey Meadowlands, from March 2011 to April 2012. Analytes measured include dissolved methane, sulfate, dissolved organic carbon, temperature, salinity, and pH. Measurements were conducted using porewater dialysis samplers, and water was sampled from the surface to a depth of 60 cm.
Martin, Nicholas R; Blackman, Edith; Bratton, Benjamin P; Chase, Katelyn J; Bartlett, Thomas M; Gitai, Zemer
Abstract:
Bacterial species have diverse cell shapes that enable motility, colonization, and virulence. The cell wall defines bacterial shape and is primarily built by two cytoskeleton-guided synthesis machines, the elongasome and the divisome. However, the mechanisms producing complex shapes, like the curved-rod shape of Vibrio cholerae, are incompletely defined. Previous studies have reported that species-specific regulation of cytoskeleton-guided machines enables formation of complex bacterial shapes such as cell curvature and cellular appendages. In contrast, we report that CrvA and CrvB are sufficient to induce complex cell shape autonomously of the cytoskeleton in V. cholerae. The autonomy of the CrvAB module also enables it to induce curvature in the Gram-negative species Escherichia coli, Pseudomonas aeruginosa, Caulobacter crescentus, and Agrobacterium tumefaciens. Using inducible gene expression, quantitative microscopy, and biochemistry we show that CrvA and CrvB circumvent the need for patterning via cytoskeletal elements by regulating each other to form an asymmetrically-localized, periplasmic structure that directly binds to the cell wall. The assembly and disassembly of this periplasmic structure enables dynamic changes in cell shape. Bioinformatics indicate that CrvA and CrvB may have diverged from a single ancestral hybrid protein. Using fusion experiments in V. cholerae, we find that a synthetic CrvA/B hybrid protein is sufficient to induce curvature on its own, but that expression of two distinct proteins, CrvA and CrvB, promotes more rapid curvature induction. We conclude that morphological complexity can arise independently of cell shape specification by the core cytoskeleton-guided synthesis machines.
Data from the 2007 Developmental Idealism survey conducted in Gansu province in China's northwestern borderlands reveal that Muslims of the Hui and Dongxiang ethnicities reported much higher rates of cohabitation experience than the secular majority Han. Based on follow-up qualitative interviews, we found the answer to lie in the interplay between the highly interventionist Chinese state and the robust cultural resilience of local Islamic communities. Using the 2000 census data and the 2010 China Family Panel Studies data, we further show that women in almost all ten Muslim ethnic groups have higher percentages of underage births and premarital births than Han women, both nationally and in the northwest where most Chinese Muslims live. As the once-outlawed behavior of cohabitation became more socially acceptable during the reform and opening-up era, young Muslim Chinese often found themselves in “arranged cohabitations” as de facto marriages formed at younger-than-legal ages.
This dataset encompasses three distinct sets of data analyzed in the study, namely the survey data on favorability to the US, the survey data on trust in Americans, and the social media data.
Hepatitis B virus (HBV) infection remains a major public health problem and, when associated with hepatitis delta virus (HDV) co-infection, causes the most severe viral hepatitis and accelerated liver disease progression. As a defective satellite RNA virus, HDV can only propagate in the presence of HBV infection, which makes HBV DNA and HDV RNA the standard biomarkers for monitoring the virological response to antiviral therapy in co-infected patients. Although assays have been described to quantify these viral nucleic acids in circulation independently, a method for monitoring both viruses simultaneously is not available, thus hampering characterization of their complex dynamic interactions. Here, we describe the development of a dual fluorescence channel detection system for pan-genotypic, simultaneous quantification of HBV DNA and HDV RNA through a one-step quantitative PCR. The sensitivity for both HBV and HDV is about 10 copies per microliter without significant interference between these two detection targets. This assay provides reliable detection for HBV and HDV basic research in vitro and in human liver chimeric mice. Preclinical validation of this system on serum samples from patients on or off antiviral therapy also illustrates a promising application that is rapid and cost-effective in monitoring HBV and HDV viral loads simultaneously.
Physical and biogeochemical variables from the NOAA-GFDL Earth System Model 2M experiments (pre-processed), previously published observation-based datasets, and code to reproduce figures from these datasets, used for the study 'Hydrological cycle amplification reshapes warming-driven oxygen loss in Atlantic Ocean'.
Microscopy images are part of a paper entitled "Structured foraging of soil predators unveils functional responses to bacterial defenses" by Fernando Rossine, Gabriel Vercelli, Corina Tarnita, and Thomas Gregor. For detailed acquisition methods see the paper. Experiments were performed between 2019 and 2020 at Princeton University. Two types of images are provided: macroscopic images and microscopic widefield images. Macroscopic images all show Petri dishes covered in fluorescent bacteria being consumed by amoebae. Images are shown for D. discoideum, P. violaceum, and A. castellanii. Images depicting drug treatments (Nystatin and Fluorouracil) were obtained using D. discoideum. Images used for the creation of a profile were all taken within 30 minutes of each other. Within each directory, numbered images are independent replicates. The raw video directory contains time series for dishes under drug treatments. Each numbered folder is a sequence of photos (taken 30 minutes apart) of a single dish. Microscopic images all show amoebae consuming bacteria on a Petri dish. The 45-minute videos show either edge cells (located at the edge of amoebae colonies) or inner cells (located 2.5 millimeters towards the center of the colony, from the edge). Videos are confocal stacks, with bacteria showing in green and amoebae appearing as black holes within the bacterial lawn. As with the macroscopic images, images are shown for D. discoideum, P. violaceum, and A. castellanii. Images depicting drug treatments (Nystatin and Fluorouracil) were obtained using D. discoideum.
Bhattacharjee, Tapomoy; Amchin, Daniel; Alert, Ricard; Ott, Jenna; Datta, Sujit
Abstract:
Collective migration -- the directed, coordinated motion of many self-propelled agents -- is a fascinating emergent behavior exhibited by active matter that has key functional implications for biological systems. Extensive studies have elucidated the different ways in which this phenomenon may arise. Nevertheless, how collective migration can persist when a population is confronted with perturbations, which inevitably arise in complex settings, is poorly understood. Here, by combining experiments and simulations, we describe a mechanism by which collectively migrating populations smooth out large-scale perturbations in their overall morphology, enabling their constituents to continue to migrate together. We focus on the canonical example of chemotactic migration of Escherichia coli, in which fronts of cells move via directed motion, or chemotaxis, in response to a self-generated nutrient gradient. We identify two distinct modes in which chemotaxis influences the morphology of the population: cells in different locations along a front migrate at different velocities due to spatial variations in (i) the local nutrient gradient and in (ii) the ability of cells to sense and respond to the local nutrient gradient. While the first mode is destabilizing, the second mode is stabilizing and dominates, ultimately driving smoothing of the overall population and enabling continued collective migration. This process is autonomous, arising without any external intervention; instead, it is a population-scale consequence of the manner in which individual cells transduce external signals. Our findings thus provide insights to predict, and potentially control, the collective migration and morphology of cell populations and diverse other forms of active matter.
Pan, Da; Gelfand, Ilya; Tao, Lei; Abraha, Michael; Sun, Kang; Guo, Xuehui; Chen, Jiquan; Robertson, G. Philip; Zondlo, Mark A.
Abstract:
This dataset contains spectroscopic simulations, experimental results for the 2202 cm-1 N2O absorption line, and N2O flux measurements shown in "A New Open-path Eddy Covariance Method for N2O and Other Trace Gases that Minimizes Temperature Corrections" by Da Pan, Ilya Gelfand, Lei Tao, Michael Abraha, Kang Sun, Xuehui Guo, Jiquan Chen, G. Philip Robertson, and Mark A. Zondlo. The HITRAN Application Programming Interface (HAPI) with HITRAN 2016 was used for spectroscopic simulations. Experiments were conducted to quantify H2O-broadened half-width at half maximum and validate spectroscopic simulations. N2O flux was measured with both eddy covariance and static chamber methods.
Elevated reactive nitrogen (Nr) deposition is a concern for alpine ecosystems, and dry NH3 deposition is a key contributor. Understanding how emission hotspots impact downwind ecosystems through dry NH3 deposition provides opportunities for effective mitigation. However, direct NH3 flux measurements with sufficient temporal resolution to quantify such events are rare. Here, we measured NH3 fluxes at Rocky Mountain National Park (RMNP) during two summers and analyzed transport events from upwind agricultural and urban sources in northeastern Colorado. We deployed open-path NH3 sensors on a mobile laboratory and an eddy covariance tower to measure NH3 concentrations and fluxes. Our spatial sampling illustrated an upslope event that transported NH3 emissions from the hotspot to RMNP. Observed NH3 deposition was significantly higher when backtrajectories passed through only the agricultural region (7.9 ng m-2 s-1) versus only the urban area (1.0 ng m-2 s-1) and both urban and agricultural areas (2.7 ng m-2 s-1). Cumulative NH3 fluxes were calculated using observed, bidirectional modeled, and gap-filled fluxes. More than 40% of the total dry NH3 deposition occurred when air masses were traced back to agricultural source regions. More generally, we identified that 10 (25) more national parks in the U.S. are within 100 (200) km of an NH3 hotspot, and more observations are needed to quantify the impacts of these hotspots on dry NH3 deposition in these regions.
This dataset encompasses two distinct sets of data analyzed in the study, namely Asian American Scholar Forum survey data and Microsoft Academic Graph bibliometrics data:
Yu Xie, Xihong Lin, Ju Li, Qian He, Junming Huang, Caught in the Crossfire: Fears of Chinese-American Scientists, Proceedings of the National Academy of Sciences, in press (2023).
This dataset encompasses three distinct sets of data analyzed in the study, namely the survey data on favorability to the US, the survey data on trust in Americans, and the social media data.
The first part of the dataset comprises the data analyzed in Study 1 and Study 3, which were collected from three surveys: the Social Attitude Questionnaire of Urban and Rural Residents (SAQURR) in 2019 and 2020, the COVID-19 Multi-Wave Study (CMWS) between 2020 and 2022, and the Survey on Living Conditions (SLC) in 2023.
The second part of the dataset provides information used in Study 4, involving the 2018 and 2020 waves of the CFPS, Baidu Index data, and the COVID-19 cases and deaths data.
The third part of the dataset depicts trends in attitudes toward the US in Study 2.
This dataset contains example input files, training data sets and potential files related to the publication "First-principles-based Machine Learning Models for Phase Behavior and Transport Properties of CO2." by Mathur et al (2023). In this work, we developed machine learning models for CO2 based on different exchange-correlation DFT functionals. We assessed their performance on liquid densities, vapor-liquid equilibrium and transport properties.
Guo, Xuehui; Pan, Da; Daly, Ryan; Chen, Xi; Walker, John; Tao, Lei; McSpiritt, James; Zondlo, Mark
Abstract:
Gas-phase ammonia (NH3), emitted primarily from agriculture, contributes significantly to reactive nitrogen (Nr) deposition. Excess deposition of Nr to the environment causes acidification, eutrophication, and loss of biodiversity. The exchange of NH3 between land and atmosphere is bidirectional and can be highly heterogeneous when underlying vegetation and soil characteristics differ. Direct measurements that assess the spatial heterogeneity of NH3 fluxes are lacking. To this end, we developed and deployed two fast-response, quantum cascade laser-based open-path NH3 sensors to quantify NH3 fluxes at a deciduous forest and an adjacent grassland separated by 700 m in North Carolina, United States from August to November 2017. The sensors achieved 10 Hz precisions of 0.17 ppbv and 0.23 ppbv in the field, respectively. Eddy covariance calculations showed net deposition of NH3 (-7.3 ng NH3-N m−2 s−1) to the forest canopy and emission (3.2 ng NH3-N m−2 s−1) from the grassland. NH3 fluxes at both locations displayed diurnal patterns with absolute magnitudes largest at midday and with smaller peaks in the afternoons. Concurrent biogeochemistry data showed over an order of magnitude higher NH3 emission potentials from green vegetation at the grassland compared to the forest, suggesting a possible explanation for the observed flux differences. Back trajectories originating from the site identified the upwind urban area as the main source region of NH3. Our work highlights the fact that adjacent natural ecosystems sharing the same airshed but different vegetation and biogeochemical conditions may differ remarkably in NH3 exchange. Such heterogeneities should be considered when upscaling point measurements, downscaling modeled fluxes, and evaluating Nr deposition for different natural land use types in the same landscape.
Additional in-situ flux measurements accompanied by comprehensive biogeochemical and micrometeorological records over longer periods are needed to fully characterize the temporal variabilities and trends of NH3 fluxes and identify the underlying driving factors.
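For readers unfamiliar with the eddy covariance method mentioned in this abstract, the core quantity is the covariance between fluctuations in vertical wind speed and gas concentration over an averaging period. The following is a minimal sketch of that calculation only, not the processing used in the study: the function name and the synthetic values are illustrative assumptions, and steps that real flux processing typically includes (coordinate rotation, detrending, spectral and density corrections) are omitted.

```python
import numpy as np


def eddy_covariance_flux(w, c):
    """Return the mean product of fluctuations w' * c' over one averaging period.

    w: vertical wind speed samples (e.g. m s-1)
    c: gas concentration samples taken at the same times (e.g. ng m-3)
    A negative result means net transport toward the surface (deposition),
    a positive result means net emission.
    """
    w = np.asarray(w, dtype=float)
    c = np.asarray(c, dtype=float)
    if w.shape != c.shape:
        raise ValueError("w and c must have the same shape")
    w_prime = w - w.mean()  # fluctuation about the period mean
    c_prime = c - c.mean()
    return float(np.mean(w_prime * c_prime))


# Synthetic illustration (made-up numbers, not data from the study).
w_demo = [1.0, -1.0, 1.0, -1.0]
emission_flux = eddy_covariance_flux(w_demo, [2.0, 0.0, 2.0, 0.0])    # updrafts carry more gas
deposition_flux = eddy_covariance_flux(w_demo, [0.0, 2.0, 0.0, 2.0])  # downdrafts carry more gas
print(emission_flux, deposition_flux)
```

With wind speed in m s-1 and concentration in ng m-3, the result has units of ng m-2 s-1, matching the sign convention in the abstract (negative for deposition, positive for emission).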
Numerical data is tabulated for all plots (Figures 2, 3a-b, 4-89, S1, S4a-b,d, S5a-b,d, S6-S156) and included as separate spreadsheets categorized by figure in a .zip file in the Supplementary Material. Error bars in Figure 4 show the spread of data observed for 4 and 5 trials on independent samples for MIL-101 and MOF-235, respectively. Figure 6a shows the average of triplicate filtrate test conversions with error propagated based on this spread. Figures 6b and S165 error bars on rate constants are determined based on propagated conversion uncertainty for independent trials and extracted standard deviations of pseudo-first order rate constants from linearized plots. Error bars on other plots represent propagation of experimental uncertainty on single trials.
These files contain code used to segment D. virilis acoustic duets, along with quantification of courtship behaviors during acoustic duets and measurements of duet song features.
This dataset was created for the paper titled 'Co-benefits of Transport Demand Reductions from Compact Urban Development in Chinese Cities', published in Nature Sustainability. We construct 6 scenarios of compact urban development, alternative energy vehicle deployment, and power decarbonization to explore the co-benefits of transport demand reductions via compact urban development for carbon emissions, energy use, air quality, and human health in China in 2050. This dataset provides the following gridded information for the scenarios: (1) monthly mean surface PM2.5 concentrations from the WRF-Chem model; (2) annual PM2.5-related premature deaths calculated by the GEMM model; (3) 2015 population in China; (4) mask for provinces in China; (5) longitude and latitude of each grid center.
Zhou, Mi; Peng, Liqun; Zhang, Lin; Mauzerall, Denise L.
Abstract:
This dataset was created for the paper titled 'Environmental Benefits and Household Costs of Clean Heating Options in Northern China', published in Nature Sustainability. Based on a 2015 regional anthropogenic emission inventory (base case), we propose seven counterfactual scenarios in which all 2015 residential solid fuel heating in northern China switches to one of the following non-district heating options: clean coal with improved stoves (CCIS), natural gas heaters (NGH), resistance heaters (RH), or air-to-air heat pumps (AAHP). This dataset provides the following gridded information for the base case and each clean heating scenario: (1) annual residential heating emissions for PM2.5/NOx/SO2; (2) monthly mean surface PM2.5 concentrations from the WRF-Chem model; (3) annual PM2.5-related premature deaths calculated by the GEMM model; (4) 2015 population in China; (5) mask for provinces in China; (6) longitude and latitude of each grid center.
This distribution compiles numerous physical properties for 2,585 intrinsically disordered proteins (IDPs) obtained by coarse-grained molecular dynamics simulation. This combination comprises "Dataset A" as reported in "Featurization strategies for polymer sequence or composition design by machine learning" by Roshan A. Patel, Carlos H. Borca, and Michael A. Webb (DOI: 10.1039/D1ME00160D). The specific IDP sequences are sourced from version 9.0 of the DisProt database. The simulations were performed using the LAMMPS molecular dynamics engine. The interactions used for simulation are obtained from R. M. Regy, J. Thompson, Y. C. Kim and J. Mittal, Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins, Protein Sci., 2021, 1371–1379.
This item provides access to all configurations of single-chain nanoparticles analyzed in the manuscript "Sequence Patterning, Morphology, and Dispersity in Single-Chain Nanoparticles: Insights from Simulation and Machine Learning" by Roshan A. Patel, Sophia Colmenares, and Michael A. Webb (DOI: 10.1021/acspolymersau.3c00007). The single-chain nanoparticles derive from 320 unique precursor chains that are distinguished by the fraction of linker beads that decorate a fixed-length polymer backbone and the distribution or blockiness of those linker beads. The data is provided in the form of a serialized object created with the Python pickle module. The data was compiled using Python version 3.8.8 and Clang 10.0.0. The Python object loaded from the .pkl file is a nested list, with the first dimension having 7,680 entries for the 7,680 unique single-chain nanoparticles produced in the aforementioned paper. Each of those 7,680 entries is itself a list with 20 entries, representing the 20 different simulation snapshots of the given single-chain nanoparticle. Each of the 20 entries is another list with two entries, with the first being a numpy.ndarray containing the x,y,z coordinates of all the beads comprising the single-chain nanoparticle and the second being a numpy.ndarray with a numerical encoding to indicate whether the beads are backbone (indicated as '0') or linker beads (indicated as '1'). Altogether, this provides 153,600 configurations of single-chain nanoparticles.
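As a minimal sketch of how the nested structure described above might be traversed after loading, the example below builds a small toy object with the same nesting (particles, then snapshots, then a coordinates/bead-type pair) and summarizes it. The file name `configurations.pkl` in the comment is a placeholder rather than the actual file name in the distribution, the (n_beads, 3) coordinate shape is an assumption, and the helper names `summarize` and `linker_fraction` are hypothetical.

```python
import pickle

import numpy as np


def summarize(data):
    """Count particles, snapshots, and total configurations in the nested list."""
    return {
        "n_particles": len(data),
        "n_snapshots_first": len(data[0]),
        "n_configurations": sum(len(snapshots) for snapshots in data),
    }


def linker_fraction(bead_types):
    """Fraction of beads encoded as 1 (linker) rather than 0 (backbone)."""
    return float(np.mean(np.asarray(bead_types) == 1))


# Toy stand-in: 3 particles x 2 snapshots x 5 beads (the real data is far larger).
rng = np.random.default_rng(0)
toy = [
    [[rng.normal(size=(5, 3)), np.array([0, 1, 0, 0, 1])] for _ in range(2)]
    for _ in range(3)
]

# Round-trip through pickle to mimic reading the distributed file. With the real
# data one would instead do something like:
#     with open("configurations.pkl", "rb") as f:   # placeholder file name
#         data = pickle.load(f)
data = pickle.loads(pickle.dumps(toy))

summary = summarize(data)
coords, bead_types = data[0][0]  # first snapshot of the first particle
print(summary)
print(coords.shape, linker_fraction(bead_types))
```

Applied to the full dataset, the description above implies counts of 7,680 particles, 20 snapshots each, and 153,600 configurations. Because unpickling can execute arbitrary code, .pkl files should only be loaded from a trusted source.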
Khanna, Jaya; Medvigy, David; Fueglistaler, Stephan; Walko, Robert
Abstract:
More than 20% of the Amazon rainforest has been cleared in the past three decades, triggering important hydroclimatic changes. Small-scale (~few kilometers) deforestation in the 1980s has caused thermally-triggered atmospheric circulations that increase regional cloudiness and precipitation frequency. However, these circulations are predicted to diminish as deforestation increases. Here we use multi-decadal satellite records and numerical model simulations to show a regime shift in the regional hydroclimate accompanying increasing deforestation in Rondônia, Brazil. Compared to the 1980s, present-day deforested areas in downwind western Rondônia are found to be wetter than upwind eastern deforested areas during the local dry season. The resultant precipitation change in the two regions is approximately ±25% of the deforested area mean. Meso-resolution simulations robustly reproduce this transition when forced with increasing deforestation alone, showing a negligible role of large-scale climate variability. Furthermore, deforestation-induced surface roughness reduction is found to play an essential role in the present-day dry season hydroclimate. Our study illustrates the strong scale-sensitivity of the climatic response to Amazonian deforestation and suggests that deforestation is sufficiently advanced to have caused a shift from a thermally- to a dynamically-driven hydroclimatic regime.
This dataset provides the data generated during the project analyzing ‘Food Consumption Strategies for Addressing Air Pollution, Climate Change, Water Use, and Public Health in China’. It includes the code for generating the alternative dietary scenarios, for analyzing the health impacts of alternative diets, and for visualization of results.
The dataset contains the model file for the Global Adjoint Tomography Model 25 (GLAD-M25). The model file contains parameters defined on the spectral-element mesh and is recommended for use in SPECFEM3D GLOBE for seismic wave simulation at the global scale.
Chen, Xu; Li, Zhongshu; Gallagher, Kevin P.; Mauzerall, Denise L.
Abstract:
Power sector decarbonization requires a fundamental redirection of global finance from fossil fuel infrastructure towards low carbon technologies. Bilateral finance plays an important role in the global energy transition to non-fossil energy, but an understanding of its impact is limited. Here, for the first time, we compare the influence of overseas finance from the three largest economies – United States, China, and Japan – on power generation development beyond their borders and evaluate the associated long-term CO2 emissions. We construct a new dataset of Japanese and U.S. overseas power generation finance between 2000 and 2018 by analyzing their national development finance institutions’ press releases and annual reports and tracking their foreign direct investment at the power plant level. Synthesizing this new data with previously developed datasets for China, we find that the three countries’ overseas financing was concentrated in fossil fuel power technologies over the studied period. Financing commitments from China, Japan, and the United States facilitated 101 GW, 95 GW, and 47 GW overseas power capacity additions, respectively. The majority of facilitated capacity additions are fossil fuel plants (64% for China, 87% for Japan, and 66% for the United States). Each of the countries’ contributions to non-hydro renewable generation was less than 15% of their facilitated capacity additions. Together, we estimate that overseas fossil fuel power financing through 2018 from these three countries will lock in 24 Gt CO2 emissions by 2060. If climate targets are to be met, replacing bilateral fossil fuel financing with financing of renewable technologies is crucial.
There has been considerable recent interest in the high-pressure behavior of silicon carbide, a potential major constituent of carbon-rich exoplanets. In this work, the atomic-level structure of SiC was determined through in situ X-ray diffraction under laser-driven ramp compression up to 1.5 TPa, stresses more than seven times greater than those of previous static and shock data. Here we show that the B1-type structure persists over this stress range and we have constrained its equation of state (EOS). Using this data we have determined the first experimentally based mass-radius curves for a hypothetical pure SiC planet. Interior structure models are constructed for planets consisting of a SiC-rich mantle and iron-rich core. Carbide planets are found to be ~10% less dense than corresponding terrestrial planets.
Geyman, Emily C.; Wu, Ziman; Nadeau, Matthew D.; Edmonsond, Stacey; Turner, Andrew; Purkis, Sam J.; Howes, Bolton; Dyer, Blake; Ahm, Anne-Sofie C.; Yao, Nan; Deutsch, Curtis A.; Higgins, John A.; Stolper, Daniel A.; Maloof, Adam C.
Abstract:
Carbonate mud represents one of the most important geochemical archives for reconstructing ancient climatic, environmental, and evolutionary change from the rock record. Mud also represents a major sink in the global carbon cycle. Yet, there remains no consensus about how and where carbonate mud is formed. In this contribution, we present new geochemical data that bear on this problem, including stable isotope and minor and trace element data from carbonate sources in the modern Bahamas such as ooids, corals, foraminifera, and green algae.
The carbon isotopic (δ13C) composition of shallow-water carbonates often is interpreted to reflect the δ13C of the global ocean and is used as a proxy for changes in the global carbon cycle. However, local platform processes, in addition to meteoric and marine diagenesis, may decouple carbonate δ13C from that of the global ocean. To shed light on the extent to which changing sediment grain composition may produce δ13C shifts in the stratigraphic record, we present new δ13C measurements of benthic foraminifera, solitary corals, calcifying green algae, ooids, coated grains, and lime mud from the modern Great Bahama Bank (GBB). This survey of a modern carbonate environment reveals δ13C variability comparable to the largest δ13C excursions in the last two billion years of Earth history.
The history of organismal evolution, seawater chemistry, and paleoclimate is recorded in layers of carbonate sedimentary rock. Meter-scale cyclic stacking patterns in these carbonates often are interpreted as representing sea level change. A reliable sedimentary proxy for eustasy would be profoundly useful for reconstructing paleoclimate, since sea level responds to changes in temperature and ice volume. However, the translation from water depth to carbonate layering has proven difficult, with recent surveys of modern shallow water platforms revealing little correlation between carbonate facies (i.e., grain size, sedimentary bed forms, ecology) and water depth. We train a convolutional neural network with satellite imagery and new field observations from a 3,000 km2 region northwest of Andros Island (Bahamas) to generate a facies map with 5 m resolution. Leveraging a newly-published bathymetry for the same region, we test the hypothesis that one can extract a signal of water depth change, not simply from individual facies, but from sequences of facies transitions analogous to vertically stacked carbonate strata. Our Hidden Markov Model (HMM) can distinguish relative sea level fall from random variability with ∼90% accuracy. Finally, since shallowing-upward patterns can result from local (autogenic) processes in addition to forced mechanisms such as eustasy, we search for statistical tools to diagnose the presence or absence of external forcings on relative sea level. With a new data-driven forward model that simulates how modern facies mosaics evolve to stack strata, we show how different sea level forcings generate characteristic patterns of cycle thicknesses in shallow carbonates, providing a new tool for quantitative reconstruction of ancient sea level conditions from the geologic record.
Chronic hepatitis B (CHB), caused by hepatitis B virus (HBV), remains a major medical problem. HBV has a high propensity for progressing to chronicity and can result in severe liver disease, including fibrosis, cirrhosis and hepatocellular carcinoma. CHB patients frequently present with viral coinfection, including HIV and hepatitis delta virus. About 10% of chronic HIV carriers are also persistently infected with HBV which can result in more exacerbated liver disease. Mechanistic studies of HBV-induced immune responses and pathogenesis, which could be significantly influenced by HIV infection, have been hampered by the scarcity of immunocompetent animal models. Here, we demonstrate that humanized mice dually engrafted with components of a human immune system and a human liver supported HBV infection, which was partially controlled by human immune cells, as evidenced by lower levels of serum viremia and HBV replication intermediates in the liver. HBV infection resulted in priming and expansion of human HLA-restricted CD8+ T cells, which acquired an activated phenotype. Notably, our dually humanized mice support persistent coinfections with HBV and HIV which opens opportunities for analyzing immune dysregulation during HBV and HIV coinfection and preclinical testing of novel immunotherapeutics.
This dataset contains input files, training data and other files related to the machine learning models developed during the work by Muniz et al. In this work, we construct machine learning models based on the MB-pol many-body model. We find that the training set should include cluster configurations as well as liquid phase configurations in order to accurately represent both liquid and VLE properties. The results attest to the ability of machine learning models to accurately represent many-body potentials and provide an efficient avenue for water simulations.
This dataset contains input and output files to reproduce the results of the manuscript "Homogeneous ice nucleation in an ab initio machine learning model" by Pablo M. Piaggi, Jack Weis, Athanassios Z. Panagiotopoulos, Pablo G. Debenedetti, and Roberto Car (arXiv preprint https://arxiv.org/abs/2203.01376). In this work, we studied the homogeneous nucleation of ice from supercooled liquid water using a machine learning model trained on ab initio energies and forces. Since nucleation takes place over times much longer than the simulation times that can be afforded using molecular dynamics simulations, we make use of the seeding technique that is based on simulating an ice cluster embedded in liquid water. The key quantity provided by the seeding technique is the size of the critical cluster (i.e., a size such that the cluster has equal probabilities of growing or shrinking at the given supersaturation). Using data from the seeding simulations and the equations of classical nucleation theory we compute nucleation rates that can be compared with experiments.
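To illustrate how a critical cluster size can be turned into a nucleation rate, the sketch below uses the standard classical nucleation theory expressions commonly paired with seeding: a barrier of N_c|Δμ|/2, a Zeldovich factor of sqrt(|Δμ| / (6π k_B T N_c)), and a rate J = ρ Z f⁺ exp(-ΔG/(k_B T)). This is a hedged illustration, not the code in this dataset; whether the manuscript uses exactly these forms is an assumption, and the function names and inputs (per-molecule chemical potential difference in joules, liquid number density, attachment rate) are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def cnt_barrier(n_critical, delta_mu):
    """Free energy barrier (J) for a critical cluster of n_critical molecules."""
    return 0.5 * n_critical * abs(delta_mu)


def zeldovich_factor(n_critical, delta_mu, temperature):
    """Dimensionless Zeldovich factor for the same critical cluster."""
    return math.sqrt(abs(delta_mu) / (6.0 * math.pi * K_B * temperature * n_critical))


def nucleation_rate(n_critical, delta_mu, temperature, number_density, attachment_rate):
    """Rate J in nuclei per unit volume per unit time.

    number_density: molecules per unit volume in the liquid
    attachment_rate: rate at which molecules attach to the critical cluster (1/time)
    """
    barrier = cnt_barrier(n_critical, delta_mu)
    z = zeldovich_factor(n_critical, delta_mu, temperature)
    return number_density * z * attachment_rate * math.exp(-barrier / (K_B * temperature))
```

A larger critical cluster at fixed Δμ raises the barrier linearly, so the rate falls off exponentially with N_c; this is why the critical size is the key quantity obtained from seeding.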