This is the question I will answer in my final project. The elevator pitch is to transform the contents of a book into topological structures (simplicial complexes) that can be compared to one another mathematically. From there it can be determined whether an author’s artistic tendencies are preserved in the transformation. If they are, we can guess the author, with some level of accuracy, by looking at the simplicial complexes generated from a book’s content.
In general I am interested in using mathematical structure to understand seemingly unrelated objects like language. I have experimented with other approaches in the past, such as Latent Semantic Analysis, which essentially uses singular value decomposition to derive semantic structure from documents. I am interested in comparing the results of Topological Data Analysis to those of similar models for this task.
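To make the Latent Semantic Analysis idea concrete, here is a minimal sketch of how a truncated singular value decomposition can place documents in a low-dimensional "semantic" space. It is only an illustration: the function names `lsa_embed` and `cosine`, the whitespace tokenizer, the raw word counts, and the toy sentences are all assumptions made for this example, not anything taken from the project itself.

```python
import numpy as np

def lsa_embed(docs, k=2):
    """Embed each document in k dimensions via truncated SVD of a term-document matrix."""
    # Hypothetical tokenizer: lowercase and split on whitespace.
    tokens = [d.lower().split() for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    index = {w: i for i, w in enumerate(vocab)}

    # Rows are terms, columns are documents, entries are raw counts.
    counts = np.zeros((len(vocab), len(docs)))
    for j, doc in enumerate(tokens):
        for w in doc:
            counts[index[w], j] += 1

    # counts = U @ diag(s) @ Vt; keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    k = min(k, len(s))
    return Vt[:k].T * s[:k]  # one row per document, shape (n_docs, k)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy corpus (invented for illustration): two "sea" texts, two "society" texts.
docs = [
    "whale sea ship",
    "sea ship whale captain",
    "love marriage estate ball",
    "marriage estate love sister ball",
]
emb = lsa_embed(docs, k=2)
print(cosine(emb[0], emb[1]), cosine(emb[0], emb[2]))
```

Documents that share vocabulary end up pointing in similar directions, so the first similarity should come out much larger than the second. A real comparison would likely use weighted counts (for example tf-idf) and far longer texts.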
hduan2
February 19, 2018 — 19:45
Do you mean to develop an algorithm that transforms the contents of a book into a topological structure and then connects that structure with the author’s writing style? How are you going to quantify a “personal style”? I hope you find the answers to your question, and I look forward to hearing them.
jhmuelle
February 19, 2018 — 20:48
I should have been more specific. This is the Python library I will use: http://danifold.net/mapper/introduction.html
So I will not be inventing any new model or algorithm for this task. By “personal style” I meant the particular way an author writes. Quantifying this directly is challenging, at least from a mathematical perspective. So I am assuming that each author has a somewhat unique style, and hoping that the Mapper algorithm preserves whatever this “style” is in a way that is quantifiable.
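For readers who have not seen Mapper before, the general idea can be sketched in a few lines. This is a toy version written from scratch, not the API of the library linked above. It covers the range of a filter function with overlapping intervals, clusters the points that fall in each interval, and joins two clusters whenever they share a point. The names `mapper_graph` and `_clusters`, the single-linkage style clustering with threshold `eps`, and all parameter values are assumptions chosen for illustration. How a book would be turned into a point cloud in the first place is left open here.

```python
import numpy as np
from itertools import combinations

def _clusters(points, idx, eps):
    """Group the indices in idx into connected components of the eps-neighbour graph."""
    remaining = set(idx)
    found = []
    while remaining:
        seed = remaining.pop()
        comp, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = [j for j in remaining
                    if np.linalg.norm(points[i] - points[j]) <= eps]
            for j in near:
                remaining.remove(j)
                comp.add(j)
                stack.append(j)
        found.append(frozenset(comp))
    return found

def mapper_graph(points, filt, n_intervals=4, overlap=0.3, eps=0.5):
    """Return (nodes, edges): the vertices and edges of a toy Mapper complex.

    points : array of shape (n, d), the point cloud
    filt   : array of shape (n,), the filter value of each point
    """
    lo, hi = filt.min(), filt.max()
    length = (hi - lo) / n_intervals
    nodes = []
    for k in range(n_intervals):
        # Widen each interval on both sides so that neighbours overlap.
        a = lo + k * length - overlap * length
        b = lo + (k + 1) * length + overlap * length
        idx = np.flatnonzero((filt >= a) & (filt <= b)).tolist()
        nodes.extend(_clusters(points, idx, eps))
    # Two clusters are joined by an edge when they have a point in common.
    edges = [(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]]
    return nodes, edges

# Toy data (invented): 20 evenly spaced points on a segment, filtered by x.
xs = np.linspace(0.0, 1.0, 20)
pts = np.column_stack([xs, np.zeros(20)])
nodes, edges = mapper_graph(pts, xs, n_intervals=4, overlap=0.3, eps=0.1)
print(len(nodes), edges)
```

This sketch only records vertices and edges. A full simplicial complex would also add a triangle wherever three clusters share a point, and so on in higher dimensions. On the segment above the result is a simple path, which matches the shape of the data.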