Abstract A key challenge in understanding subcellular organization is quantifying interpretable measurements of intracellular structures with complex multi-piece morphologies in an objective, robust and generalizable manner. Here we introduce a morphology-appropriate representation learning framework that uses 3D rotation invariant autoencoders and point clouds. This framework is used to learn representations of complex multi-piece morphologies that are independent of orientation, compact, and easy to interpret. We apply our framework to intracellular structures with punctate morphologies (e.g. DNA replication foci) and polymorphic morphologies (e.g. nucleoli). We systematically compare our framework to image-based autoencoders across several intracellular structure datasets, including a synthetic dataset with pre-defined rules of organization. We explore the trade-offs in the performance of different models by performing multi-metric benchmarking across efficiency, generative capability, and representation expressivity metrics. We find that our framework, which embraces the underlying morphology of multi-piece structures, facilitates the unsupervised discovery of sub-clusters for each structure. We show how our approach can also be applied to phenotypic profiling using a dataset of nucleolar images following drug perturbations. We implement and provide all representation learning models using CytoDL, a python package for flexible and configurable deep learning experiments.