A method may obtain a plurality of tables of data, wherein the plurality of tables includes a plurality of cells arranged in rows and columns. The method may generate an output value for a cell of a first table with a transformer-based model, wherein the model includes a three-dimensional attention mechanism. The method may determine a first attention score across cells of the first table in the same row as the cell. The method may determine a second attention score across cells of the first table in the same column as the cell. The method may determine a third attention score across all rows of the first table. The method may determine a fourth attention score across embeddings from different tables of the plurality of tables. The method may calculate an embedding based on a combination of the attention scores. The method may determine the output value based on the embedding.
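
By way of illustration only, the four attention scores and their combination may be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `attend`, `cell_output`, and `readout` are hypothetical, each cell is assumed to already carry an embedding vector, rows and tables are summarized by mean pooling, the combination is a plain average, and the output value is a dot product with a hypothetical readout vector. None of these choices is specified in the text above.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def mean_vec(vectors):
    """Element-wise mean of equal-length vectors (assumed pooling choice)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def attend(query, keys):
    """Scaled dot-product attention; keys double as values (an assumption).

    Returns (attention scores, context vector).
    """
    scale = math.sqrt(len(query))
    scores = softmax([dot(query, k) / scale for k in keys])
    context = [
        sum(w * k[i] for w, k in zip(scores, keys))
        for i in range(len(query))
    ]
    return scores, context


def cell_output(tables, row, col, readout):
    """Compute an embedding and output value for cell (row, col) of tables[0].

    tables[t][r][c] is the embedding (list of floats) of one cell.
    """
    first = tables[0]
    query = first[row][col]

    row_keys = first[row]                          # first score: same row
    col_keys = [r[col] for r in first]             # second score: same column
    all_row_keys = [mean_vec(r) for r in first]    # third score: all rows
    table_keys = [                                 # fourth score: across tables
        mean_vec([cell for r in t for cell in r]) for t in tables
    ]

    contexts = [
        attend(query, keys)[1]
        for keys in (row_keys, col_keys, all_row_keys, table_keys)
    ]
    embedding = mean_vec(contexts)   # assumed combination: plain average
    return embedding, dot(embedding, readout)
```

In this sketch, `cell_output(tables, 0, 1, readout)` would return the combined embedding and a scalar output for the cell in row 0, column 1 of the first table. A trained model would typically use learned query, key, and value projections and a learned combination in place of the fixed averaging assumed here.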
Full Text
What is claimed is: