αILP: thinking visual scenes as differentiable logic programs

Citations: 2 · Mendeley readers: 11

This article is free to access.

Abstract

Deep neural learning has shown remarkable performance at learning representations for visual object categorization. However, deep neural networks such as CNNs do not explicitly encode objects and the relations among them. This limits their success on tasks that require a deep logical understanding of visual scenes, such as Kandinsky patterns and Bongard problems. To overcome these limitations, we introduce αILP, a novel differentiable inductive logic programming framework that learns to represent scenes as logic programs—intuitively, logical atoms correspond to objects, attributes, and relations, while clauses encode high-level scene information. αILP provides an end-to-end reasoning architecture from visual inputs and performs differentiable inductive logic programming on complex visual scenes; that is, the logical rules are learned by gradient descent. Our extensive experiments on the Kandinsky patterns and CLEVR-Hans benchmarks demonstrate the accuracy and efficiency of αILP in learning complex visual-logical concepts.
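The core idea — learning logical rules by gradient descent — can be sketched in a toy form. In the snippet below (an illustrative simplification, not αILP's actual architecture), each candidate clause has a precomputed valuation: a soft truth score per scene, as a perception module might produce. A softmax over learnable clause weights mixes the valuations into a scene score, and plain gradient descent on a cross-entropy loss shifts the weights toward the clause that fits the labels. All names, valuations, and labels here are hypothetical.

```python
import math

# Hypothetical valuations for 2 candidate clauses over 4 scenes,
# e.g. clause 0 = "a scene is positive if it contains a red circle".
V = [
    [0.9, 0.9, 0.1, 0.1],  # clause 0: fits the target concept
    [0.5, 0.1, 0.9, 0.5],  # clause 1: a distractor clause
]
y = [1.0, 1.0, 0.0, 0.0]   # scene labels (positive / negative)

def softmax(w):
    m = max(w)
    e = [math.exp(x - m) for x in w]
    z = sum(e)
    return [x / z for x in e]

def forward(w):
    s = softmax(w)
    # Scene score = weight-averaged clause valuation (differentiable).
    p = [sum(s[k] * V[k][i] for k in range(len(V))) for i in range(len(y))]
    return s, p

def loss(p):
    # Binary cross-entropy over scenes.
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                for yi, pi in zip(y, p)) / len(y)

def grad(w):
    s, p = forward(w)
    # dL/dp_i for cross-entropy, chained through the softmax mixture:
    # dp_i/dw_k = s_k * (V[k][i] - p_i).
    dLdp = [(-(yi / pi) + (1 - yi) / (1 - pi)) / len(y)
            for yi, pi in zip(y, p)]
    return [sum(dLdp[i] * s[k] * (V[k][i] - p[i]) for i in range(len(y)))
            for k in range(len(w))]

w = [0.0, 0.0]            # learnable clause weights
for _ in range(300):      # plain gradient descent
    g = grad(w)
    w = [wi - 1.0 * gi for wi, gi in zip(w, g)]

s, p = forward(w)
# After training, the softmax concentrates on the clause that
# explains the labels, so the learned rule can be read off discretely.
```

In αILP itself the clause valuations are themselves computed by differentiable forward-chaining over logical atoms rather than fixed up front, but the gradient-based selection of clauses follows the same principle.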

Cite

CITATION STYLE

APA

Shindo, H., Pfanschilling, V., Dhami, D. S., & Kersting, K. (2023). αILP: thinking visual scenes as differentiable logic programs. Machine Learning, 112(5), 1465–1497. https://doi.org/10.1007/s10994-023-06320-1
