Using atomate
Introduction
Firetask details can be viewed in the *submit* text file generated after submission.
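The default upper limit on custodian error corrections is 5.
Custom design of atomate workflows is handled mainly by the FireWorks package.
Turning input parameters into input files is handled mainly by the pymatgen package.
Extracting data from output files, plotting, and other advanced analysis are handled mainly by the pymatgen package.
The three levels are Workflow, Firework (a list of Fireworks may be referred to as fireworks), and Firetask (one Firework consists of several basic Firetasks).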
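The nesting described above can be pictured with a small stand-in model. This is only an illustration: the dataclasses below are hypothetical and are not the FireWorks classes, and the task names are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Firetask:
    """Smallest unit: a single action."""
    name: str


@dataclass
class Firework:
    """One job: a list of Firetasks executed in sequence."""
    name: str
    tasks: List[Firetask] = field(default_factory=list)


@dataclass
class Workflow:
    """A collection of Fireworks (dependencies between them are omitted here)."""
    name: str
    fireworks: List[Firework] = field(default_factory=list)


# Hypothetical single-firework workflow, similar in spirit to a static calculation
static_fw = Firework(
    name="static",
    tasks=[Firetask("write inputs"), Firetask("run VASP"), Firetask("store results")],
)
wf = Workflow(name="static workflow", fireworks=[static_fw])

task_count = sum(len(fw.tasks) for fw in wf.fireworks)
print(f"{wf.name}: {len(wf.fireworks)} firework(s), {task_count} firetask(s)")
```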
Using lpad
    # Common options
    # -i  firework or workflow ID
    # -s  state (e.g. FIZZLED)
    # -m  maximum number of results
    # -t  table output (get_wflows)
    # -d  display level (e.g. more)

    lpad report

    lpad rerun_fws -i 3
    lpad rerun_fws -s FIZZLED

    lpad get_fws
    lpad get_fws -i 3
    lpad get_fws -s FIZZLED
    lpad get_fws -s FIZZLED -m 5
    lpad get_fws -s FIZZLED -d more

    lpad get_wflows
    lpad get_wflows -i 1
    lpad get_wflows -s FIZZLED -d more
    lpad get_wflows -s FIZZLED -t -m 5

    lpad get_launchdir fw_id

    # Other lpad subcommands
    pause_fws
    resume_fws
    defuse_wflows
    reignite_wflows
    archive_wflows
    delete_wflows

    lpad reset
    lpad init
qlaunch
    No jobs exist in the LaunchPad for submission to queue
    No READY jobs detected

    qlaunch (-r) rapidfire
    qlaunch rapidfire --nlaunches 5
    qlaunch singleshot
    -q
rlaunch
    rlaunch singleshot
    rlaunch rapidfire
When a high-throughput calculation finishes correctly, the custodian.json file records no corrections.
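One way to check this in bulk is to count the corrections logged in each custodian.json. The sketch below assumes the file is a JSON list with one entry per job, each holding a "corrections" list; that layout is an assumption, so compare it with a real file before relying on it. The sample files are hand-written stand-ins.

```python
import json
import tempfile
from pathlib import Path


def count_corrections(custodian_json: Path) -> int:
    """Return the total number of corrections logged in a custodian.json file.

    Assumes a list of per-job entries, each with a "corrections" list.
    """
    run_log = json.loads(custodian_json.read_text())
    return sum(len(entry.get("corrections", [])) for entry in run_log)


# Demo with hand-written stand-in files (not real custodian output)
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "custodian.json"

    sample.write_text(json.dumps([{"job": {}, "corrections": []}]))
    clean_total = count_corrections(sample)

    corrected_entry = {"job": {}, "corrections": [{"errors": ["hypothetical"], "actions": []}]}
    sample.write_text(json.dumps([corrected_entry]))
    corrected_total = count_corrections(sample)

print("clean run:", clean_total, "| corrected run:", corrected_total)
```

A result of 0 for every calculation directory would match the "no corrections" case described above.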
Usage tips
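After generating the fireworks of the workflows with a Python script, submit them to the queue system with the qlaunch commands. For workflows that contain a single firework (such as relaxation and static calculations), if N fireworks were generated in total, `qlaunch rapidfire --nlaunches N` is enough (for small systems, N can be reduced to N/2 or similar).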
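For workflows with multiple fireworks (say M each), such as elastic constant calculations, first work out the dependencies among those fireworks. With N workflows in total, start with `qlaunch rapidfire --nlaunches N`; once some of those first fireworks (say X) have finished, run `qlaunch rapidfire --nlaunches X*(M-1)` to compute the remaining fireworks of those workflows. This keeps the computational cost under some control (although you may need to check the fireworks' progress from time to time).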
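The launch counts in this tip are simple arithmetic, sketched below. The function names are hypothetical, and the numbers in the demo (10 workflows, 8 fireworks each, 4 first fireworks finished) are made-up inputs.

```python
def first_batch(n_workflows: int) -> int:
    """Launches needed to start the first firework of every workflow (N)."""
    return n_workflows


def follow_up_batch(n_first_finished: int, fws_per_workflow: int) -> int:
    """Launches for the remaining fireworks of the workflows whose first
    firework has finished: X * (M - 1)."""
    return n_first_finished * (fws_per_workflow - 1)


# Made-up example: N = 10 workflows, M = 8 fireworks each, X = 4 finished so far
n_initial = first_batch(10)
n_follow_up = follow_up_batch(4, 8)

print(f"qlaunch rapidfire --nlaunches {n_initial}")
print(f"qlaunch rapidfire --nlaunches {n_follow_up}")
```

The second number is only an upper bound on what can start right away, since a firework is picked up only once the fireworks it depends on have finished.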
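**Running a bare `qlaunch rapidfire` is not recommended** (as long as a job is still running, any newly generated workflow automatically joins the queue, and the calculation directories are easily mixed up).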
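atomate does not let you inspect the input files right after a workflow is generated; the workflow has to actually run before they appear. Workaround: set the number of cores to 1, wait until the job has started and the input files have been written, then cancel the job by its job ID and check the input file parameters.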
Examples
Static calculation
    from atomate.common.powerups import add_namefile, add_tags
    from atomate.vasp.workflows.presets.core import wf_static
    from fireworks.core.launchpad import LaunchPad
    from pymatgen.core.structure import Structure

    structure = Structure.from_file("POSCAR")

    wf = wf_static(structure)

    wf = add_namefile(wf)
    wf = add_tags(wf, {"task_name": "atomate static workflow test"})

    lpad = LaunchPad.auto_load()
    lpad.add_wf(wf)

    print("The static test workflow is added.")
Relaxation calculation
    from atomate.common.powerups import add_namefile, add_tags
    from atomate.vasp.workflows.presets.core import wf_structure_optimization
    from fireworks.core.launchpad import LaunchPad
    from pymatgen.core.structure import Structure

    structure = Structure.from_file("POSCAR")

    wf = wf_structure_optimization(structure)

    wf = add_namefile(wf)
    wf = add_tags(wf, {"task_name": "atomate relaxation workflow test"})

    lpad = LaunchPad.auto_load()
    lpad.add_wf(wf)

    print("The relaxation test workflow is added.")
Elastic constant calculation
    from atomate.common.powerups import add_namefile, add_tags
    from atomate.vasp.workflows.presets.core import wf_elastic_constant
    from fireworks.core.launchpad import LaunchPad
    from pymatgen.core.structure import Structure

    structure = Structure.from_file("POSCAR")

    wf = wf_elastic_constant(structure=structure)

    wf = add_namefile(wf)
    wf = add_tags(wf, {"task_name": "atomate elastic constant workflow test"})

    lpad = LaunchPad.auto_load()
    lpad.add_wf(wf)

    print("The elastic constant test workflow is added.")

The fireworks generated for the workflow:

    Ni-elastic structure optimization--78
    Ni-elastic deformation 0--77
    ...
    Ni-elastic deformation 5--72
    Analyze Elastic Data--71
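When an elastic constant workflow runs, if some deformation fireworks finish while others are FIZZLED, the elastic constants are still fitted from the deformation fireworks that did finish. Always check that every firework in the workflow has completed and that the result is reasonable.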
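A small helper can make that check explicit before the fitted tensor is trusted. In this sketch the firework names and states are typed in by hand as stand-ins (in practice they could be copied from the output of `lpad get_wflows -i <id> -d more`); the function name is hypothetical.

```python
from typing import Dict, List


def unfinished_fireworks(states: Dict[str, str]) -> List[str]:
    """Return the names of fireworks whose state is not COMPLETED."""
    return [name for name, state in states.items() if state != "COMPLETED"]


# Hand-written stand-in for the states of one elastic constant workflow
states = {
    "Ni-elastic structure optimization": "COMPLETED",
    "Ni-elastic deformation 0": "COMPLETED",
    "Ni-elastic deformation 1": "FIZZLED",
    "Ni-elastic deformation 2": "COMPLETED",
    "Analyze Elastic Data": "COMPLETED",
}

pending = unfinished_fireworks(states)
if pending:
    print("Do not trust the fit yet; unfinished fireworks:", pending)
else:
    print("All fireworks completed.")
```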
Difference between the POSCAR-format (raw) and IEEE-format (ieee_format) elastic tensors produced by atomate's elastic constant calculation:
    atomate/vasp/workflows/presets/core.py
    atomate/vasp/workflows/base/elastic.py
    atomate/vasp/firetasks/parse_outputs.py

Relevant excerpts:

    {"ENCUT": 700, "EDIFF": 1e-6, "LAECHG": False, "LREAL": False}
    {"ISIF": 2, "IBRION": 2, "NSW": 99, "ISTART": 1}
    Kpoints.automatic_density(structure, 40000, force_gamma=True)

    stencils = np.linspace(-0.075, 0.075, 7)

    stencil = np.arange(0.01, 0.01 * order, step=0.01)

    if conventional:
        structure = SpacegroupAnalyzer(structure).get_conventional_standard_structure()

    uis_elastic = {"IBRION": 2, "NSW": 99, "ISIF": 2, "ISTART": 1}
    vis = vasp_input_set or MPStaticSet(structure, user_incar_settings=uis_elastic)
    strains = []
    if strain_states is None:
        strain_states = get_default_strain_states(order)
    if stencils is None:
        stencils = [np.linspace(-0.01, 0.01, 5 + (order - 2) * 2)] * len(strain_states)
    if np.array(stencils).ndim == 1:
        stencils = [stencils] * len(strain_states)
    for state, stencil in zip(strain_states, stencils):
        strains.extend([Strain.from_voigt(s * np.array(state)) for s in stencil])
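To see what the default stencil in the excerpt evaluates to, the expression `np.linspace(-0.01, 0.01, 5 + (order - 2) * 2)` can be reproduced in plain Python. The `linspace` helper and `default_stencil` below are stand-ins written for this note, not atomate or NumPy functions.

```python
from typing import List


def linspace(start: float, stop: float, num: int) -> List[float]:
    """Evenly spaced values from start to stop inclusive (pure-Python stand-in)."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def default_stencil(order: int, max_strain: float = 0.01) -> List[float]:
    """Strain magnitudes per strain state, following 5 + (order - 2) * 2 points."""
    num_points = 5 + (order - 2) * 2
    return linspace(-max_strain, max_strain, num_points)


for order in (2, 3):
    stencil = default_stencil(order)
    print(f"order {order}: {len(stencil)} points ->", [round(s, 4) for s in stencil])
```

Each strain state is then scaled by every value in its stencil, which is what the final loop in the excerpt does with `Strain.from_voigt`.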
Using MongoDB Compass
Database connection
Usage
    { "tags.structure_id": "ICET-Training-No-00754" }
Connecting atomate to MongoDB: retrieving and filtering data
    import os

    from atomate.vasp.database import VaspCalcDb

    # Path to db.json: set it directly or read it from an environment variable
    db_json_path = ...
    db_json_path = os.getenv("DB_JSON_PATH")

    atomate_db = VaspCalcDb.from_db_file(db_json_path)

    # Additional collections in the same database
    elasticity_collection = atomate_db.db["elasticity"]
    gibbs_collection = atomate_db.db["gibbs_tasks"]

    # Example queries (alternatives; keep the one you need)
    query = {"task_label": "volume relaxation"}
    query = {"task_id": {"$gt": 18, "$lt": 44}}
    query = {"tags.solute": {"$in": solute_list}}  # solute_list is defined elsewhere
    query = {"completed_at": {"$regex": "2022-08-05 *"}}

    # Fields to return (1 = include, 0 = exclude)
    projection = {
        "_id": 0,
        "dir_name": 1,
        "task_id": 1,
        "completed_at": 1,
        "state": 1,
        "task_label": 1,
        "formula_reduced_abc": 1,
        "run_stats": 1,
        "input": 1,
        "output": 1,
        "tags": 1,
    }

    # Counting matches
    count = atomate_db.collection.count_documents(query)
    # aggregate() returns a cursor; the count is the "total" field of its result document
    count = atomate_db.collection.aggregate([{"$match": query}, {"$count": "total"}])
    # Legacy form: Cursor.count() was removed in PyMongo 4.0
    count = db.collection.find(query).count()

    # Fetch all matching documents, or only the first one
    documents = atomate_db.collection.find(query, projection)
    document = atomate_db.collection.find_one(query, projection)
Available projection operators: Query and Projection Operators - MongoDB Manual v7.0
find() manual: db.collection.find() - MongoDB Manual v7.0
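Projection dictionaries like the ones passed to find() can be built from a plain list of field names. The helper below is a hypothetical convenience, not part of atomate or PyMongo; it relies only on the documented behavior that `_id` is returned unless it is explicitly set to 0.

```python
from typing import Dict, Iterable


def make_projection(fields: Iterable[str], include_id: bool = False) -> Dict[str, int]:
    """Build an inclusion projection: listed fields get 1, `_id` gets 0 unless requested."""
    projection = {name: 1 for name in fields}
    if not include_id:
        projection["_id"] = 0
    return projection


# Dotted names select nested fields
projection = make_projection(["task_id", "task_label", "output.energy_per_atom"])
print(projection)
```

The resulting dictionary can be passed as the second argument of find() or find_one().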
find() returns a pymongo.cursor.Cursor object, while find_one() returns a single document (a dict) or None; the results can be converted to JSON or to a DataFrame.
References: https://www.geeksforgeeks.org/convert-pymongo-cursor-to-json/ ; https://www.geeksforgeeks.org/convert-pymongo-cursor-to-dataframe
Checking whether the result of find() or find_one() is empty
Reference: https://www.geeksforgeeks.org/how-to-check-if-the-pymongo-cursor-is-empty
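A minimal sketch of the emptiness check, using plain iterators as stand-ins for real query results so that it runs without a database. It assumes the usual PyMongo behavior: find() yields documents one by one, and find_one() returns None when nothing matches. The helper names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional


def fetch_all(cursor: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Consume a cursor-like iterable into a list so it can be tested and reused."""
    return list(cursor)


def is_missing(document: Optional[Dict[str, Any]]) -> bool:
    """A find_one() result of None means no document matched."""
    return document is None


# Stand-ins for real query results (no database needed)
empty_docs = fetch_all(iter([]))
filled_docs = fetch_all(iter([{"task_id": 19}, {"task_id": 20}]))

print("documents in empty result:", len(empty_docs))
print("documents in filled result:", len(filled_docs))
print("find_one() without a match is missing:", is_missing(None))
```

Turning the cursor into a list first means the documents are still available after the check.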
Counting the number of keys: https://stackoverflow.com/questions/12536592/mongodb-iterate-over-collection-by-key
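On the Python side, key counts can also be collected with `collections.Counter` once the documents have been fetched. The documents below are hand-written stand-ins and the function name is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def count_keys(documents: Iterable[Dict[str, Any]]) -> Counter:
    """Count how many documents contain each top-level key."""
    counts: Counter = Counter()
    for doc in documents:
        counts.update(doc.keys())
    return counts


# Hand-written stand-in documents
documents = [
    {"task_id": 1, "state": "successful", "tags": {}},
    {"task_id": 2, "state": "successful"},
]

key_counts = count_keys(documents)
print(dict(key_counts))
```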
https://github.com/hackingmaterials/atomate/issues/445
    import re

    from pymatgen.core.structure import Structure

    # Structures are stored as dicts; rebuild them with Structure.from_dict
    input_structure_dict = document["input"]["structure"]
    output_structure_dict = document["output"]["structure"]
    structure = Structure.from_dict(...)

    # Energies
    energy = document["output"]["energy"]
    energy_pa = document["output"]["energy_per_atom"]

    # Timing
    document["run_stats"]["overall"]["Elapsed time (sec)"]
    document["run_stats"]["overall"]["Total CPU time used (sec)"]

    # Calculation directory
    dir_name = document["dir_name"]
    calc_path = (re.search("/dssg.*", dir_name)).group()

    # Custom tags
    document["tags"]["XXX"]

    # Composition and volume
    natoms = document["nsites"]
    nelements = document["nelements"]

    volume = document["output"]["structure"]["lattice"]["volume"]
    volume_pa = volume / natoms
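The same lookups can be collected into one flat row per document, which is a convenient shape for a table or DataFrame. The sketch below uses a hand-written stand-in document with made-up numbers; only the key names follow the ones used above, and `summarize` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize(document: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten selected nested fields of a task document into one row."""
    natoms = document["nsites"]
    volume = document["output"]["structure"]["lattice"]["volume"]
    return {
        "task_id": document["task_id"],
        "formula": document["formula_pretty"],
        "energy_per_atom": document["output"]["energy_per_atom"],
        "volume_per_atom": volume / natoms,
    }


# Stand-in document; the numbers are made up for illustration
document = {
    "task_id": 19,
    "formula_pretty": "Ni",
    "nsites": 4,
    "output": {
        "energy_per_atom": -5.5,
        "structure": {"lattice": {"volume": 44.0}},
    },
}

row = summarize(document)
print(row)
```

A list of such rows, one per document, can then be handed to a table library.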
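atomate document data from MongoDB cannot be written directly to a JSON file in full:
When a document is printed as a Python dict, its keys and string values are shown with single quotes, which is not valid JSON.
JSON does have booleans, but they are written `true`/`false` rather than Python's `True`/`False`, so the dict has to be serialized (for example with `json.dumps`) instead of being written out in its printed form.
Values such as the `_id` ObjectId are not JSON-serializable by default.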
    '_id': ObjectId('62dbb72c531c489b7a006879')
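The points above can be reproduced without a database. In this sketch a `datetime` value stands in for a non-serializable field such as the ObjectId (an assumption made so the example needs only the standard library), and `default=str` is one simple fallback that stores such values as strings. The document contents are made up.

```python
import json
from datetime import datetime

# Stand-in document; the datetime plays the role of a non-JSON value like ObjectId
document = {
    "task_id": 1,
    "state": "successful",
    "has_vasp_completed": True,
    "last_updated": datetime(2022, 8, 5, 12, 0, 0),
}

# The Python repr uses single quotes and True, so it is not valid JSON
print(document)

# Plain json.dumps fails on the non-serializable value
try:
    json.dumps(document)
    plain_dump_failed = False
except TypeError as exc:
    plain_dump_failed = True
    print("plain json.dumps failed:", exc)

# default=str converts anything json cannot handle into its string form
text = json.dumps(document, default=str, indent=2)
restored = json.loads(text)
print(text)
```

The fallback is lossy: the restored value is a plain string, so it has to be converted back by hand if the original type matters.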
Document keys of common workflows
    dict_keys(
        [
            "_id",
            "dir_name",
            "analysis",
            "calcs_reversed",
            "chemsys",
            "completed_at",
            "composition_reduced",
            "composition_unit_cell",
            "custodian",
            "elements",
            "formula_anonymous",
            "formula_pretty",
            "formula_reduced_abc",
            "input",
            "last_updated",
            "nelements",
            "nsites",
            "orig_inputs",
            "output",
            "run_stats",
            "schema",
            "state",
            "tags",
            "task_id",
            "task_label",
            "transformations",
        ]
    )
Keys under the `calcs_reversed` key (index it with `[0]`; most of the keys at the document's top level appear here as well)
    dict_keys(
        [
            "vasp_version",
            "has_vasp_completed",
            "nsites",
            "elements",
            "nelements",
            "run_type",
            "input",
            "output",
            "formula_pretty",
            "composition_reduced",
            "composition_unit_cell",
            "formula_anonymous",
            "formula_reduced_abc",
            "dir_name",
            "completed_at",
            "task",
            "output_file_paths",
            "bader",
        ]
    )
The elastic constant workflow generates an `elasticity` collection that holds the elastic property analysis
    dict_keys(
        [
            "_id",
            "analysis",
            "initial_structure",
            "optimized_structure",
            "tags",
            "fitting_data",
            "elastic_tensor",
            "derived_properties",
            "formula_pretty",
            "fitting_method",
            "order",
        ]
    )
The Gibbs free energy workflow generates a `gibbs_tasks` collection
    dict_keys(
        [
            "_id",
            "metadata",
            "structure",
            "formula_pretty",
            "energies",
            "volumes",
            "pressure",
            "poisson",
            "mass",
            "natoms",
            "bulk_modulus",
            "gibbs_free_energy",
            "temperatures",
            "optimum_volumes",
            "debye_temperature",
            "gruneisen_parameter",
            "thermal_conductivity",
            "anharmonic_contribution",
            "success",
        ]
    )
Related issues
    raise ServerSelectionTimeoutError(
    pymongo.errors.ServerSelectionTimeoutError:
The MongoDB connection failed (the database service is not running; XXX stands for the host)
    getaddrinfo ENOTFOUND XXX
    Connection failed:
    XXXX: [Errno 111] Connection refused, Timeout: 30s, Topology Description:
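Before digging into atomate or PyMongo settings, it can help to check whether the host and port are reachable at all. The sketch below uses only the standard library; the function name is hypothetical, 27017 is MongoDB's default port, and the host in the demo is a placeholder to replace with the one from your db.json.

```python
import socket


def can_connect(host: str, port: int = 27017, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Name resolution failures, refused connections, and timeouts all
    raise OSError subclasses and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


host = "localhost"  # placeholder: use the host from your db.json
print(f"{host}:27017 reachable:", can_connect(host))
```

A False result points to the server side (service not started, wrong host name, or a blocked port) rather than to the query code.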