XIAO Jing, HE Daijun, CAO Yang. A Dual Channel Text Encoder for Solving Math Word Problems[J]. Journal of South China Normal University (Natural Science Edition), 2023, 55(1): 36-44. DOI: 10.6054/j.jscnun.2023003

A Dual Channel Text Encoder for Solving Math Word Problems

Article Information
  • Received Date: June 01, 2022
  • Available Online: April 11, 2023

Abstract: In recent years, with the rapid development of artificial intelligence (AI) technology, research on the automatic solving of Math Word Problems (MWP) has advanced considerably. In this task, modeling the problem text is crucial. To this end, a Dual Channel Text Encoder (DCTE) based on a Recurrent Neural Network (RNN) and the Transformer is proposed. DCTE first uses an RNN to produce an initial encoding of the problem text, and then applies a Transformer based on the self-attention mechanism to capture long-range contextual information, thereby enhancing the representations of both the words and the problem text. Combining DCTE with the GTS (Goal-Driven Tree-Structured MWP Solver) decoder yields the solver DCTE-GTS. DCTE-GTS is evaluated on the Math23K dataset and compared with Graph2Tree, HMS, and other models; ablation experiments are also conducted to explore the impact of the encoder configuration on model performance. The results show that DCTE-GTS outperforms the baseline models, achieving an answer accuracy of 77.6%, and the ablation experiments show that the DCTE configuration performs best.
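
To make the two-stage encoding concrete, the following is a minimal PyTorch sketch of the idea described in the abstract: an RNN pass for sequential context, followed by a self-attention (Transformer) pass for long-range context. All names, layer sizes, and the mean-pooled problem vector are illustrative assumptions, not the authors' exact implementation.

    import torch
    import torch.nn as nn

    class DualChannelEncoder(nn.Module):
        """Sketch of an RNN-then-Transformer text encoder (hypothetical)."""

        def __init__(self, vocab_size, embed_dim=128, hidden_dim=256,
                     num_heads=8, num_layers=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            # Stage 1: a bidirectional GRU gives each token sequential context.
            self.rnn = nn.GRU(embed_dim, hidden_dim // 2,
                              batch_first=True, bidirectional=True)
            # Stage 2: self-attention over the RNN outputs captures long-range
            # dependencies that the RNN alone may miss.
            layer = nn.TransformerEncoderLayer(d_model=hidden_dim,
                                               nhead=num_heads,
                                               batch_first=True)
            self.transformer = nn.TransformerEncoder(layer,
                                                     num_layers=num_layers)

        def forward(self, token_ids):
            x = self.embedding(token_ids)         # (batch, seq_len, embed_dim)
            rnn_out, _ = self.rnn(x)              # (batch, seq_len, hidden_dim)
            attn_out = self.transformer(rnn_out)  # (batch, seq_len, hidden_dim)
            # Token vectors would feed a tree-structured decoder such as GTS;
            # a pooled vector can initialize the decoder's root goal state.
            problem_vec = attn_out.mean(dim=1)    # (batch, hidden_dim)
            return attn_out, problem_vec

    # Toy usage: two padded problems of length 10 over a 5000-word vocabulary.
    encoder = DualChannelEncoder(vocab_size=5000)
    batch = torch.randint(0, 5000, (2, 10))
    tokens, problem = encoder(batch)
    print(tokens.shape, problem.shape)  # [2, 10, 256] and [2, 256]

Feeding the RNN output into the Transformer, rather than running the two channels independently and merging them, matches the encoding order the abstract describes (RNN first, then self-attention).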
References

[1] ZHANG D X, WANG L, BING L M, et al. The gap of semantic parsing: a survey on automatic math word problem solvers[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 42(9): 2287-2305.
[2] WANG Y, LIU X J, SHI S M. Deep neural solver for math word problems[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen: Association for Computational Linguistics, 2017: 845-854.
[3] BOBROW D G. Natural language input for a computer problem-solving system[C]//Semantic Information Processing. Cambridge: MIT Press, 1968: 146-226.
[4] SLAGLE J R. Experiments with a deductive question-answering program[J]. Communications of the ACM, 1965, 8(12): 792-798. doi: 10.1145/365691.365960
[5] FLETCHER C R. Understanding and solving arithmetic word problems: a computer simulation[J]. Behavior Research Methods, Instruments & Computers, 1985, 17(5): 565-571.
[6] BAKMAN Y. Robust understanding of word problems with extraneous information[EB/OL]. (2007-01-14)[2022-05-20]. https://arxiv.org/abs/math/0701393.
[7] KUSHMAN N, ARTZI Y, ZETTLEMOYER L, et al. Learning to automatically solve algebra word problems[C]//Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore: Association for Computational Linguistics, 2014: 271-281.
[8] ROY S, ROTH D. Unit dependency graph and its application to arithmetic word problem solving[C]//Proceedings of the AAAI Conference on Artificial Intelligence. San Francisco: AAAI Press, 2017: 3082-3088.
[9] SHI S M, WANG Y H, LIN C Y, et al. Automatically solving number word problems by semantic parsing and reasoning[C]//Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon: Association for Computational Linguistics, 2015: 1132-1142.
[10] HUANG D Q, SHI S M, LIN C Y, et al. Learning fine-grained expressions to solve math word problems[C]//Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen: Association for Computational Linguistics, 2017: 805-814.
[11] WANG L, WANG Y, CAI D, et al. Translating a math word problem to an expression tree[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels: Association for Computational Linguistics, 2018: 1064-1069.
[12] LIU Q Y, GUAN W, LI S J, et al. Tree-structured decoding for solving math word problems[C]//Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing. Hong Kong: Association for Computational Linguistics, 2019: 2370-2379.
[13] XIE Z P, SUN S C. A goal-driven tree-structured neural model for math word problems[C]//Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence. Macao: Morgan Kaufmann, 2019: 5299-5305.
[14] ZHANG J P, WANG L, LEE R K, et al. Graph-to-tree learning for solving math word problems[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Seattle: Association for Computational Linguistics, 2020: 3928-3937.
[15] CHO K, MERRIENBOER B V, GULCEHRE C, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing. Doha: Association for Computational Linguistics, 2014: 1724-1734.
[16] HOCHREITER S, SCHMIDHUBER J. Long short-term memory[J]. Neural Computation, 1997, 9(8): 1735-1780. doi: 10.1162/neco.1997.9.8.1735
[17] WANG H, SHI J C, ZHANG Z W. Text semantic relation extraction of LSTM based on attention mechanism[J]. Application Research of Computers, 2018, 35(5): 1417-1420. https://www.cnki.com.cn/Article/CJFDTOTAL-JSYJ201805030.htm
[18] VASWANI A, SHAZEER N, PARMAR N, et al. Attention is all you need[C]//Advances in Neural Information Processing Systems 30. Cambridge: MIT Press, 2017: 6000-6010.
[19] ZHANG X C, DAI X Y, LIU L, et al. Chinese short text classification model with multi-head self-attention mechanism[J]. Journal of Computer Applications, 2020, 40(12): 3485-3489. https://www.cnki.com.cn/Article/CJFDTOTAL-JSJY202012013.htm
[20] LI J R, WANG L, ZHANG J P, et al. Modeling intra-relation in math word problems with different functional multi-head attentions[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence: Association for Computational Linguistics, 2019: 6162-6167.
[21] LIN X, HUANG Z Y, ZHAO H K, et al. HMS: a hierarchical solver with dependency-enhanced understanding for math word problem[C]//Proceedings of the AAAI Conference on Artificial Intelligence. Vancouver: AAAI Press, 2021: 4232-4240.
