This dataset comprises the two sets of data analyzed in the following study, Asian American Scholar Forum survey data and Microsoft Academic Graph bibliometrics data:
Yu Xie, Xihong Lin, Ju Li, Qian He, Junming Huang, Caught in the Crossfire: Fears of Chinese-American Scientists, Proceedings of the National Academy of Sciences, in press (2023).
This dataset contains example input files, training data sets, and potential files related to the publication "First-principles-based Machine Learning Models for Phase Behavior and Transport Properties of CO2" by Mathur et al. (2023). In this work, we developed machine learning models for CO2 based on different exchange-correlation DFT functionals. We assessed their performance on liquid densities, vapor-liquid equilibrium, and transport properties.
Guo, Xuehui; Pan, Da; Daly, Ryan; Chen, Xi; Walker, John; Tao, Lei; McSpiritt, James; Zondlo, Mark
Abstract:
Gas-phase ammonia (NH3), emitted primarily from agriculture, contributes significantly to reactive nitrogen (Nr) deposition. Excess deposition of Nr to the environment causes acidification, eutrophication, and loss of biodiversity. The exchange of NH3 between land and atmosphere is bidirectional and can be highly heterogeneous when underlying vegetation and soil characteristics differ. Direct measurements that assess the spatial heterogeneity of NH3 fluxes are lacking. To this end, we developed and deployed two fast-response, quantum cascade laser-based open-path NH3 sensors to quantify NH3 fluxes at a deciduous forest and an adjacent grassland separated by 700 m in North Carolina, United States, from August to November 2017. The sensors achieved 10 Hz precisions of 0.17 ppbv and 0.23 ppbv in the field, respectively. Eddy covariance calculations showed net deposition of NH3 (−7.3 ng NH3-N m−2 s−1) to the forest canopy and emission (3.2 ng NH3-N m−2 s−1) from the grassland. NH3 fluxes at both locations displayed diurnal patterns with absolute magnitudes largest at midday and with smaller peaks in the afternoons. Concurrent biogeochemistry data showed over an order of magnitude higher NH3 emission potentials from green vegetation at the grassland compared to the forest, suggesting a possible explanation for the observed flux differences. Back trajectories originating from the site identified the upwind urban area as the main source region of NH3. Our work highlights that adjacent natural ecosystems sharing the same airshed but different vegetation and biogeochemical conditions may differ remarkably in NH3 exchange. Such heterogeneities should be considered when upscaling point measurements, downscaling modeled fluxes, and evaluating Nr deposition for different natural land use types in the same landscape.
Additional in-situ flux measurements accompanied by comprehensive biogeochemical and micrometeorological records over longer periods are needed to fully characterize the temporal variabilities and trends of NH3 fluxes and identify the underlying driving factors.
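For readers unfamiliar with the eddy covariance technique mentioned in the abstract, the core quantity is the time-averaged product of the fluctuations of vertical wind speed and gas concentration over an averaging window. The following is a minimal sketch of that definition only. The function name and inputs are hypothetical, and it omits the coordinate rotation, detrending, density, and spectral corrections that a real processing chain applies; it is not the authors' processing code.

```python
from statistics import fmean


def eddy_covariance_flux(w, c):
    """Return the raw covariance flux mean(w' * c') for one averaging window.

    w: vertical wind speed samples (assumed units: m s-1)
    c: concentration samples taken at the same times (assumed units: ng NH3-N m-3)
    The result then has units of ng NH3-N m-2 s-1. With upward wind counted as
    positive, a negative value means net deposition and a positive value
    means net emission.
    """
    if len(w) != len(c) or len(w) == 0:
        raise ValueError("w and c must be non-empty and of equal length")
    w_mean = fmean(w)
    c_mean = fmean(c)
    # Fluctuations are deviations from the window mean (Reynolds decomposition).
    products = [(wi - w_mean) * (ci - c_mean) for wi, ci in zip(w, c)]
    return fmean(products)
```

At 10 Hz, a 30-minute averaging window (a common choice, assumed here rather than taken from the abstract) would supply 18,000 paired samples to such a function.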
Numerical data is tabulated for all plots (Figures 2, 3a-b, 4-89, S1, S4a-b,d, S5a-b,d, S6-S156) and included as separate spreadsheets, categorized by figure, in a .zip file in the Supplementary Material. Error bars in Figure 4 show the spread of data observed over 4 and 5 trials on independent samples for MIL-101 and MOF-235, respectively. Figure 6a shows the average of triplicate filtrate test conversions, with error propagated from this spread. Error bars on the rate constants in Figures 6b and S165 are determined from the propagated conversion uncertainty for independent trials and from the standard deviations of the pseudo-first-order rate constants extracted from linearized plots. Error bars on other plots represent propagation of experimental uncertainty on single trials.
Zhou, Mi; Peng, Liqun; Zhang, Lin; Mauzerall, Denise L.
Abstract:
This dataset was created for the paper titled 'Environmental Benefits and Household Costs of Clean Heating Options in Northern China', published in Nature Sustainability. Based on a 2015 regional anthropogenic emission inventory (base case), we propose seven counterfactual scenarios in which all 2015 residential solid fuel heating in northern China switches to one of the following non-district heating options: clean coal with improved stoves (CCIS), natural gas heaters (NGH), resistance heaters (RH), or air-to-air heat pumps (AAHP). This dataset provides the following gridded information for the base case and each clean heating scenario: (1) annual residential heating emissions for PM2.5/NOx/SO2; (2) monthly mean surface PM2.5 concentrations from the WRF-Chem model; (3) annual PM2.5-related premature deaths calculated by the GEMM model; (4) 2015 population in China; (5) mask for provinces in China; (6) longitude and latitude of each grid center.
This distribution compiles numerous physical properties for 2,585 intrinsically disordered proteins (IDPs) obtained by coarse-grained molecular dynamics simulation. This combination comprises "Dataset A" as reported in "Featurization strategies for polymer sequence or composition design by machine learning" by Roshan A. Patel, Carlos H. Borca, and Michael A. Webb (DOI: 10.1039/D1ME00160D). The specific IDP sequences are sourced from version 9.0 of the DisProt database. The simulations were performed using the LAMMPS molecular dynamics engine. The interactions used for simulation are obtained from R. M. Regy, J. Thompson, Y. C. Kim and J. Mittal, Improved coarse-grained model for studying sequence dependent phase separation of disordered proteins, Protein Sci., 2021, 1371–1379.
This item provides access to all configurations of single-chain nanoparticles analyzed in the manuscript "Sequence Patterning, Morphology, and Dispersity in Single-Chain Nanoparticles: Insights from Simulation and Machine Learning" by Roshan A. Patel, Sophia Colmenares, and Michael A. Webb (DOI: 10.1021/acspolymersau.3c00007). The single-chain nanoparticles derive from 320 unique precursor chains that are distinguished by the fraction of linker beads that decorate a fixed-length polymer backbone and the distribution or blockiness of those linker beads. The data is provided in the form of a serialized object created with the Python 'pickle' module. The data was compiled using Python version 3.8.8 and Clang 10.0.0. The Python object loaded from the .pkl file is a nested list, with the first dimension having 7,680 entries for the 7,680 unique single-chain nanoparticles produced in the aforementioned paper. Each of those 7,680 entries is itself a list with 20 entries, representing the 20 different simulation snapshots of the given single-chain nanoparticle. Each of the 20 entries is another list with two entries, the first being a numpy.ndarray containing the x,y,z coordinates of all the beads comprising the single-chain nanoparticle and the second being a numpy.ndarray with a numerical encoding that indicates whether each bead is a backbone bead (indicated as '0') or a linker bead (indicated as '1'). Altogether, this provides 153,600 configurations of single-chain nanoparticles.
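The nesting described above (nanoparticle, then snapshot, then a coordinates/bead-type pair) can be walked as in the following minimal sketch. The function names are hypothetical, the file name is left to the caller, and the code assumes that the coordinate array holds one x,y,z row per bead, which the description does not state explicitly. NumPy must be installed for the arrays to unpickle, and because unpickling can execute arbitrary code, the file should only be loaded from a trusted source.

```python
import pickle

N_PARTICLES = 7680  # unique single-chain nanoparticles, per the description
N_SNAPSHOTS = 20    # simulation snapshots per nanoparticle, per the description


def load_configurations(path):
    """Unpickle the nested list stored in the .pkl file at `path`."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def count_configurations(data):
    """Return (number of nanoparticles, total number of snapshots)."""
    return len(data), sum(len(snapshots) for snapshots in data)


def split_beads(coords, bead_types):
    """Separate one snapshot into backbone (0) and linker (1) coordinates.

    Assumes one coordinate row per bead, in the same order as bead_types.
    """
    backbone = [xyz for xyz, kind in zip(coords, bead_types) if kind == 0]
    linkers = [xyz for xyz, kind in zip(coords, bead_types) if kind == 1]
    return backbone, linkers
```

For the full dataset, count_configurations should return 7,680 nanoparticles and 7,680 × 20 = 153,600 configurations. A single snapshot would be accessed as data[i][j], with data[i][j][0] holding the coordinates and data[i][j][1] the bead encoding.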